sriniiyer/codenn

Broken data format

Closed this issue · 2 comments

There is a lot of broken data starting at line 2353 of csharp/train.txt (https://raw.githubusercontent.com/sriniiyer/codenn/master/data/stackoverflow/csharp/train.txt), for example:

29768982	29768200	C# sorting multidimensional array by multiple columns	using System;
\nusing System.Collections.Generic;
\nusing System.Linq;
\nusing System.Text;
\n
\nnamespace ConsoleApplication19
\n{
\n    class Program
\n    {
\n        static void Main(string[] args)
\n        {
\n            List<List<int>> multiarray = new List<List<int>>{    
\n                new List<int> { 8, 63  },
\n                new List<int>  { 4, 2   }, 
\n                new List<int>  { 0, -55 }, 
\n                new List<int>  { 8, 57  }, 
\n                new List<int>  { 2, -120}, 
\n                new List<int>  { 8, 53  }  
\n            };
\n           
\n
\n            List<List<int>> sortedList = multiarray.OrderBy(x => x[1]).OrderBy(y => y[0]).ToList();
\n
\n        }
\n    }
\n}	0

Are you using a Windows-based system? What you are seeing is the carriage return character, which is ignored by *nix systems. You can just strip all carriage returns from the file and it should be OK.
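
For anyone hitting the same problem, here is a minimal Python sketch of the carriage-return cleanup described above. The function names `strip_carriage_returns` and `clean_file` and the file paths in the usage note are hypothetical and not part of the codenn repository. It assumes the file fits in memory and that every `\r` byte in it is unwanted.

```python
from pathlib import Path


def strip_carriage_returns(data: bytes) -> bytes:
    """Remove every carriage return byte (0x0D); line feeds are left untouched."""
    return data.replace(b"\r", b"")


def clean_file(src: str, dst: str) -> int:
    """Write a copy of src without carriage returns to dst.

    Works on raw bytes so no newline translation happens on any platform.
    Returns the number of carriage return bytes that were removed.
    """
    raw = Path(src).read_bytes()
    cleaned = strip_carriage_returns(raw)
    Path(dst).write_bytes(cleaned)
    return len(raw) - len(cleaned)
```

Calling something like `clean_file("train.txt", "train.clean.txt")` (both paths are placeholders) should leave each record on a single tab-separated line, with the escaped `\n` sequences inside the code field unchanged.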

Thanks.