Kotlin/dataframe

GroupBy.sort and sortBy vague error reporting

koperagen opened this issue · 0 comments

Given the dataset at https://www.kaggle.com/datasets/ruchi798/data-science-job-salaries and the following schema:

@DataSchema
interface DsSalaries {
    @ColumnName("company_location")
    val companyLocation: String
    @ColumnName("company_size")
    val companySize: String
    @ColumnName("employee_residence")
    val employeeResidence: String
    @ColumnName("employment_type")
    val employmentType: String
    @ColumnName("experience_level")
    val experienceLevel: String
    @ColumnName("job_title")
    val jobTitle: String
    @ColumnName("remote_ratio")
    val remoteRatio: Int
    val salary: Int
    @ColumnName("salary_currency")
    val salaryCurrency: String
    @ColumnName("salary_in_usd")
    val salaryInUsd: Int
    val untitled: Int
    @ColumnName("work_year")
    val workYear: Int
}

This call doesn't fail:
df.group { salaryInUsd }.into("group").groupBy { companyLocation }.sortBy(pathOf("group", "salaryInUsd")).print()
This one fails with "can't apply sort flag to column group":
df.group { salaryInUsd }.into("group").groupBy { companyLocation }.sortByDesc(pathOf("group", "salaryInUsd")).print()

In reality, both snippets are wrong: the correct call is sortBy(pathOf("group", "group", "salary_in_usd")), which is itself confusing.
In both cases, these functions should report that the path pathOf("group", "salaryInUsd") doesn't exist.
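
To illustrate the kind of error message I'd expect, here is a minimal, self-contained sketch. It does not use the DataFrame API: Node, Leaf, Group, resolve and exampleSchema are hypothetical names, and the nesting in exampleSchema is only my assumption about the column layout after groupBy, based on the correct path mentioned above.

```kotlin
// Hypothetical stand-in for a column hierarchy; not part of the Kotlin DataFrame API.
sealed interface Node
data class Leaf(val type: String) : Node
data class Group(val children: Map<String, Node>) : Node

// Walks `path` from `root` and fails with a message that names the first
// segment that cannot be resolved, together with the names that do exist there.
fun resolve(root: Group, path: List<String>): Node {
    val fullPath = path.joinToString("/")
    var current: Node = root
    for ((index, name) in path.withIndex()) {
        // A leaf column has no children, so the path cannot continue through it.
        val group = current as? Group
            ?: throw IllegalArgumentException(
                "Column path '$fullPath' does not exist: '${path[index - 1]}' has no nested columns"
            )
        current = group.children[name]
            ?: throw IllegalArgumentException(
                "Column path '$fullPath' does not exist: no column '$name' " +
                    "at depth $index, available: ${group.children.keys}"
            )
    }
    return current
}

// Assumed layout after group { salaryInUsd }.into("group").groupBy { companyLocation }:
// the outer "group" holds the grouped rows, the inner "group" is the column group.
fun exampleSchema(): Group = Group(
    mapOf(
        "company_location" to Leaf("String"),
        "group" to Group(
            mapOf(
                "group" to Group(mapOf("salary_in_usd" to Leaf("Int"))),
                "job_title" to Leaf("String")
            )
        )
    )
)
```

With this sketch, resolve(exampleSchema(), listOf("group", "group", "salary_in_usd")) returns the leaf column, while resolve(exampleSchema(), listOf("group", "salaryInUsd")) throws an IllegalArgumentException that names the missing segment and lists the columns available at that level. sortBy and sortByDesc could both report something along these lines instead of succeeding silently or complaining about sort flags.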