CSV to JSON does not recognize line breaks in Windows 7
c3piio opened this issue · 5 comments
I have a very simple CSV that Access 2010 created. When I convert it to JSON the resulting text does not have any line breaks. I tried changing line_sep to:
"line_sep": false
"line_sep": "false"
"line_sep": "\r"
None of these settings gets the line breaks in the CSV recognized. The first three lines of the CSV are below:
"geonameid","name","asciiname","alternatenames","loc","feature_class","feature_code","country_code","cc2","admin1_code","admin2_code","admin3_code","admin4_code"
3,"Zamīn Sūkhteh","Zamin Sukhteh","Zamin Sukhteh,Zamīn Sūkhteh","[48.91667,32.48333]","P","PPL","IR",,"15",,,
5,"Yekāhī","Yekahi","Yekahi,Yekāhī","[48.9,32.5]","P","PPL","IR",,"15",,,
Can you paste in the result when you try to convert that snapshot?
What is the result if you go to the console and type "import os; os.linesep"?
After converting to JSON it looks like this:
[{""name"": "Zam\u012bn S\u016bkhteh", ""alternatenames"": "Zamin Sukhteh,Zam\u012bn S\u016bkhteh", ""cc2"": "", ""admin1_code"": "15", ""geonameid"": "3", ""feature_class"": "P", ""country_code"": "IR", ""loc"": "[48.91667,32.48333]", ""feature_code"": "PPL", ""admin3_code"": "", ""admin2_code"": "", ""admin4_code"": "", ""asciiname"": "Zamin Sukhteh"}, {""name"": "Yek\u0101h\u012b", ""alternatenames"": "Yekahi,Yek\u0101h\u012b", ""cc2"": "", ""admin1_code"": "15", ""geonameid"": "5", ""feature_class"": "P", ""country_code"": "IR", ""loc"": "[48.9,32.5]", ""feature_code"": "PPL", ""admin3_code"": "", ""admin2_code"": "", ""admin4_code"": "", ""asciiname"": "Yekahi"} .... etc ....
Result of os.linesep in the console:
import os; os.linesep
'\r\n'
I can reproduce this. Apart from the quotes getting doubled, it's valid JSON. Fixing the quotes should be a quick fix. In the meantime, try doing a regex find-replace for "\\"|\\"" to ".
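For anyone who would rather script the workaround than use the editor's find-replace, here is a minimal Python sketch. It is not the plugin's fix. It assumes the doubled quotes appear only around keys, as in the output pasted above, so empty values ("") are left alone. The function name undouble_keys is hypothetical.

```python
import json
import re

def undouble_keys(text):
    """Collapse ""key"": into "key": (hypothetical helper, not part of the plugin).

    Assumption: only keys are wrapped in doubled quotes, as in the pasted
    output. Values, including empty strings (""), are not touched.
    """
    return re.sub(r'""([^"]+)"":', r'"\1":', text)

# Shortened sample in the same shape as the converter output above
broken = '[{""name"": "Zamin Sukhteh", ""cc2"": "", ""admin1_code"": "15"}]'
rows = json.loads(undouble_keys(broken))
```

A blanket replace of "" with " would also mangle the empty values, which is why the pattern is anchored on the colon that follows a key.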
Here's what I get when I do that find-replace and then format the JSON:
[
{
"name": "Zamīn Sūkhteh",
"alternatenames": "Zamin Sukhteh,Zamīn Sūkhteh",
"cc2": "",
"admin1_code": "15",
"geonameid": "3",
"feature_class": "P",
"country_code": "IR",
"loc": "[48.91667,32.48333]",
"feature_code": "PPL",
"admin3_code": "",
"admin2_code": "",
"admin4_code": "",
"asciiname": "Zamin Sukhteh"
},
{
"name": "Yekāhī",
"alternatenames": "Yekahi,Yekāhī",
"cc2": "",
"admin1_code": "15",
"geonameid": "5",
"feature_class": "P",
"country_code": "IR",
"loc": "[48.9,32.5]",
"feature_code": "PPL",
"admin3_code": "",
"admin2_code": "",
"admin4_code": "",
"asciiname": "Yekahi"
}
]
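The formatting step can also be done outside the editor. A minimal sketch with Python's standard json module, assuming the quotes have already been fixed; the sample string is a shortened stand-in for the real output:

```python
import json

# Compact JSON with \uXXXX escapes, in the same shape as the converter output
compact = '[{"name": "Zam\\u012bn S\\u016bkhteh", "geonameid": "3"}]'

rows = json.loads(compact)
# indent=2 adds the line breaks; ensure_ascii=False keeps ī and ū readable
# instead of writing them back out as \u012b and \u016b
pretty = json.dumps(rows, indent=2, ensure_ascii=False)
```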
Having the same issue. It seems like the line-ending detection is messed up.
I guess it's a Python thing? http://bugs.python.org/issue18829
FWIW, I had a similar issue today, where any conversion simply returned an "empty" result (e.g. converting to JSON returned [], even using the sample data that is on the README).
The solution I found in Sublime Text was to go to View > Line Endings > Unix, then the conversion worked.
I'm running Sublime Text 3, with Windows 10, too.
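The same workaround can be applied in code by normalizing line endings before the CSV is parsed. The sketch below is not the plugin's implementation; it only uses Python's standard csv, io and json modules, and the function name csv_text_to_json is hypothetical.

```python
import csv
import io
import json

def csv_text_to_json(text):
    """Convert CSV text to a JSON string (hypothetical helper).

    Line endings are normalized to \n first, which mirrors switching the
    view to Unix line endings. Note this also rewrites any \r\n that sits
    inside a quoted field.
    """
    normalized = text.replace("\r\n", "\n").replace("\r", "\n")
    reader = csv.DictReader(io.StringIO(normalized))
    return json.dumps(list(reader), ensure_ascii=False)

# Windows-style sample, shortened from the CSV in the original report
sample = '"geonameid","name","cc2"\r\n5,"Yekāhī",\r\n'
result = csv_text_to_json(sample)
```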