
After I uploaded a dataset to Hugging Face, the Dataset Viewer could not display it and reported this error:
The dataset viewer is not available for this split.

Cannot extract the features (columns) for the split 'train' of the config 'default' of the dataset.
Error code:   FeaturesError
Exception:    ValueError
Message:      Not able to read records in the JSON file at hf://datasets/xxx/train.json.
Traceback:    Traceback (most recent call last):
                File "/src/services/worker/src/worker/job_runners/split/first_rows.py", line 243, in compute_first_rows_from_streaming_response
                  iterable_dataset = iterable_dataset._resolve_features()
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 2215, in _resolve_features
                  features = _infer_features_from_batch(self.with_format(None)._head())
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1239, in _head
                  return _examples_to_batch(list(self.take(n)))
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1388, in __iter__
                  for key, example in ex_iterable:
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 1044, in __iter__
                  yield from islice(self.ex_iterable, self.n)
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/iterable_dataset.py", line 282, in __iter__
                  for key, pa_table in self.generate_tables_fn(**self.kwargs):
                File "/src/services/worker/.venv/lib/python3.9/site-packages/datasets/packaged_modules/json/json.py", line 164, in _generate_tables
                  raise ValueError(f"Not able to read records in the JSON file at {file}.") from None
              ValueError: Not able to read records in the JSON file at hf://datasets/xxx/train.json.

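Other people's datasets displayed fine in the viewer, so I downloaded one to take a look.
Their JSON files all had one dictionary per line,
while mine was a single list containing many dictionaries.
So I set out to convert my file into their format.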

import json
# Read the JSON file (a single list of dictionaries)
with open('train.json', 'r', encoding='utf-8') as f:
    data = json.load(f)
# Write each dictionary to a new file, one dictionary per line
with open('train_new.json', 'w', encoding='utf-8') as f:
    for item in data:
        json_string = json.dumps(item, ensure_ascii=False)
        f.write(json_string + '\n')
print("Done")

Then I opened the new file in VS Code, and it reported an error: End of file expected. json [Ln 2, Col 1]
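The VS Code message comes from a strict JSON parser: a .json file is expected to hold exactly one JSON value, so the object on the second line counts as trailing data. Here is a minimal sketch that reproduces the same behavior with Python's standard json module. The two sample records are made up for illustration and are not from my dataset.

```python
import json

# Two made-up records, one JSON object per line (JSON Lines layout).
text = '{"id": 1}\n{"id": 2}\n'

# Parsing the whole text as a single JSON document fails, because a second
# value follows the first one.
try:
    json.loads(text)
    error = None
except json.JSONDecodeError as exc:
    error = exc
    print(f"{exc.msg} at line {exc.lineno}, column {exc.colno}")

# Parsing line by line works, since each line is valid JSON on its own.
records = [json.loads(line) for line in text.splitlines() if line.strip()]
print(records)
```

The position reported by Python (line 2, column 1) is the same place VS Code points at, which suggests the file content is fine and only the assumed format is wrong.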

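It turned out to be a very basic mistake. I asked both ChatGPT-4 and Claude and neither could answer it, and a Baidu search turned up nothing.
One JSON object per line is the JSON Lines format rather than a single JSON document, so the file simply needs to be saved with the .jsonl extension.
After that, the Dataset Viewer displayed the dataset normally.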
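Putting the conversion and the extension fix together, the whole thing could look roughly like the sketch below. The helper name json_array_to_jsonl and the sample records are my own for illustration, and the demo runs in a temporary directory; for a real dataset you would pass your own train.json and train.jsonl paths.

```python
import json
import tempfile
from pathlib import Path


def json_array_to_jsonl(src, dst):
    """Convert a file holding one JSON array into JSON Lines (one item per line)."""
    with open(src, "r", encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, list):
        raise ValueError(f"{src} does not contain a top-level JSON array")
    with open(dst, "w", encoding="utf-8") as f:
        for item in data:
            # ensure_ascii=False keeps non-ASCII text readable in the output.
            f.write(json.dumps(item, ensure_ascii=False) + "\n")
    return len(data)


# Demo with made-up records in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "train.json"
    dst = Path(tmp) / "train.jsonl"  # note the .jsonl extension
    sample = [{"text": "你好"}, {"text": "hello"}]
    src.write_text(json.dumps(sample, ensure_ascii=False), encoding="utf-8")

    count = json_array_to_jsonl(src, dst)
    lines = dst.read_text(encoding="utf-8").splitlines()
    roundtrip = [json.loads(line) for line in lines]
    print(count, roundtrip)
```

Reading the output back line by line is a cheap check that every line is valid JSON on its own before uploading.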
