imgcook/datacook

Todo

Closed this issue · 20 comments

This will be a thread for discussion about what we should cover in the future.

IMO, we should cover four major data types:

  • Tabular
  • Text
  • Image
  • Sound

We should also implement some common features such as dataset splitting, shuffling, data augmentation, and maybe visualization?

Besides, data-type specific operations are also needed:

  • Generic
    • Train test split
    • Random sampler
    • Scalers (Standard, Robust, MinMax)
  • Image
    • Basic I/O
    • Filter
    • Crop
    • Resize
    • Rotating
    • Augmentation
    • color_mode
    • Normalize
    • (Some more ideas)
  • Text
    • Embedding
    • Serialize
    • stats analysis
    • Tokenization
    • Stemming
    • Stopword removal
    • Padding
  • Sound
    • FFT?
  • Tabular Data
    • Encodings
    • I/O
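To make the "Generic" items a bit more concrete, here is a rough sketch of what a shuffle, a train/test split, and a MinMax scaler could look like in TypeScript. All names here (`seededRandom`, `shuffle`, `trainTestSplit`, `minMaxScale`) are hypothetical and only for illustration; none of this is DataCook's actual API, and the rounding rule for the test size is just one possible choice.

```typescript
// Hypothetical helpers for illustration only; not DataCook's actual API.

/** Small seeded PRNG (mulberry32-style) so that shuffles are reproducible. */
function seededRandom(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

/** Fisher-Yates shuffle that returns a new array and leaves the input untouched. */
function shuffle<T>(items: readonly T[], random: () => number = Math.random): T[] {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    const tmp = out[i] as T;
    out[i] = out[j] as T;
    out[j] = tmp;
  }
  return out;
}

/** Shuffles the items, then puts round(n * testRatio) of them in the test set. */
function trainTestSplit<T>(
  items: readonly T[],
  testRatio: number = 0.25,
  random: () => number = Math.random
): { train: T[]; test: T[] } {
  if (testRatio < 0 || testRatio > 1) {
    throw new RangeError("testRatio must be between 0 and 1");
  }
  const shuffled = shuffle(items, random);
  const testSize = Math.round(items.length * testRatio);
  return { test: shuffled.slice(0, testSize), train: shuffled.slice(testSize) };
}

/** MinMax scaling to [0, 1]; a constant column is mapped to all zeros. */
function minMaxScale(values: readonly number[]): number[] {
  if (values.length === 0) return [];
  let min = Infinity;
  let max = -Infinity;
  for (const v of values) {
    if (v < min) min = v;
    if (v > max) max = v;
  }
  const range = max - min;
  return values.map((v) => (range === 0 ? 0 : (v - min) / range));
}

// Usage: 8 samples with a 25% test ratio give 6 train and 2 test samples.
const demoSplit = trainTestSplit([1, 2, 3, 4, 5, 6, 7, 8], 0.25, seededRandom(42));
const demoScaled = minMaxScale([10, 20, 30]); // [0, 0.5, 1]
```

Passing the random source as a parameter keeps the split deterministic in tests while still defaulting to `Math.random` for everyday use.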

Discussions are welcome here!

TypeScript or vanilla JavaScript?

Implement in TypeScript?

yes

Since danfo has already covered some basic interactions with tabular data, shall we support other types of data first, e.g. image data?

/cc @steveoni @risenW

Or we could create a base app that integrates with danfo just for the sake of having an API to test.

@WenheLI Do we really need visualization?

IMO, the visualization could be achieved by other JavaScript libs in the community.

Aye, I think visualization is kinda out of this repo's scope!

@WenheLI @steveoni I added some updates to the todo list above.

@risenW Looks solid! I think we have covered the most common methods for data processing. I think for now we can just follow this todo list and implement them step by step.
IMO, for the first step, we could wrap danfo and export some tabular related data-processing methods. And then move to image processing and some generic processing methods.

/cc @yorkie @steveoni @risenW

Yes, let's work with this. So I think the next step will be to define the code structure and a style guide. Then we can create a basic scaffold application. @WenheLI, do you have a style guide, or should I quickly draft one?

@risenW Sure! Let's get things on track! It's kinda late here now. If possible, could you write a draft? We can review it later!

Okay, I'll draft one later today.

@yorkie @steveoni @WenheLI I created the base project scaffold and also added a contributing guide. We can discuss what should be kept or discarded in the guide, as it is not final.

@risenW - This looks terrific!! The Contributing Guide looks perfect! I will add more details to the scaffold and then we can make this repo public! This looks promising!

@risenW Just updated the codebase to TS. If this is okay with you, shall we make the repo public?

Looks great! Yeah, we can make it public. Quick question, are we using TypeScript or vanilla JS?

Definitely TypeScript. It makes life easier!

Okay, noted.

Just removed because this is stale and almost done.