GroupBy.sort and sortBy vague error reporting
koperagen opened this issue · 0 comments
Given the dataset at https://www.kaggle.com/datasets/ruchi798/data-science-job-salaries and this schema:
@DataSchema
interface DsSalaries {
    @ColumnName("company_location")
    val companyLocation: String
    @ColumnName("company_size")
    val companySize: String
    @ColumnName("employee_residence")
    val employeeResidence: String
    @ColumnName("employment_type")
    val employmentType: String
    @ColumnName("experience_level")
    val experienceLevel: String
    @ColumnName("job_title")
    val jobTitle: String
    @ColumnName("remote_ratio")
    val remoteRatio: Int
    val salary: Int
    @ColumnName("salary_currency")
    val salaryCurrency: String
    @ColumnName("salary_in_usd")
    val salaryInUsd: Int
    val untitled: Int
    @ColumnName("work_year")
    val workYear: Int
}
This doesn't fail:
df.group { salaryInUsd }.into("group").groupBy { companyLocation }.sortBy(pathOf("group", "salaryInUsd")).print()
This one does fail, with "can't apply sort flag to column group":
df.group { salaryInUsd }.into("group").groupBy { companyLocation }.sortByDesc(pathOf("group", "salaryInUsd")).print()
In reality, both snippets are wrong: the call should be sortBy(pathOf("group", "group", "salary_in_usd"))
(which is itself confusing).
In both cases, these functions should report that the column at pathOf("group", "salaryInUsd") doesn't exist.
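To make the expected behavior concrete, here is a minimal standalone sketch of the kind of path check I have in mind. It does not use the DataFrame library at all: Col, Leaf, Group and checkPath are hypothetical names invented for this illustration, and modelling the per-group frame column as a plain nested group is a simplifying assumption. The nesting in main reflects my reading of why the working path is "group", "group", "salary_in_usd": the outer "group" is the column that holds each group's rows, and the inner "group" is the column group created by into("group").

```kotlin
// Hypothetical, simplified model of a nested column schema.
// None of these types come from the DataFrame library.
sealed interface Col
object Leaf : Col
data class Group(val children: Map<String, Col>) : Col

// Walks the path segment by segment.
// Returns null if the whole path resolves, otherwise a message
// that names the first segment that could not be found.
fun checkPath(root: Group, path: List<String>): String? {
    var current: Col = root
    for ((i, name) in path.withIndex()) {
        val parent = if (i == 0) "<root>" else path.take(i).joinToString("/")
        // A Leaf has no children, so descending into it also counts as "not found".
        val next = (current as? Group)?.children?.get(name)
            ?: return "Column path ${path.joinToString("/")} does not exist: " +
                "'$name' not found under $parent"
        current = next
    }
    return null
}

fun main() {
    // Assumed shape after group { salaryInUsd }.into("group").groupBy { companyLocation }
    val columnGroup = Group(mapOf("salary_in_usd" to Leaf))
    val rowsOfEachGroup = Group(mapOf("company_location" to Leaf, "group" to columnGroup))
    val grouped = Group(mapOf("company_location" to Leaf, "group" to rowsOfEachGroup))

    // Wrong path from both snippets: yields a non-null error message.
    println(checkPath(grouped, listOf("group", "salaryInUsd")))
    // Path that actually resolves: yields null.
    println(checkPath(grouped, listOf("group", "group", "salary_in_usd")))
}
```

With a check like this run before sorting, sortBy and sortByDesc would fail the same way and point at the missing segment, instead of one passing silently and the other complaining about a sort flag.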