[Feature] Suggestions for better schema generation
lazyhope opened this issue · 1 comments
Thanks for the great work! I've been leveraging instructor for structured extraction and am excited about langchain's focus on this area. I'd like to propose a few enhancements to further improve the project:
- Enhanced Chat Interface: An upgraded chat interface for easier schema creation and editing, for example one that supports uploading PDF/CSV/XLSX files containing the attributes to extract and style instructions (e.g., maxLength, pattern).
- Field Validation Support: Generate something similar to Pydantic's field validators to express more complex data constraints and improve the accuracy of extracted information. This might not be feasible with JSON schema generation alone, which leads to the next suggestion:
- Pydantic model generation: Allowing the generation of Pydantic models for greater customization and flexibility in data handling and schema enforcement.
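
To make the last two points more concrete, here is a minimal sketch of what Pydantic model generation from an attribute list could look like. Everything in it is an assumption for illustration, not an existing API of this project: the `SPEC` layout, the `KEYWORD_MAP` table, and the `render_model` helper are hypothetical, and the emitted source targets Pydantic v2 (where `Field` accepts `max_length` and `pattern`). The generator itself only uses the standard library and returns Python source as a string.

```python
from typing import Any

# Hypothetical attribute spec, e.g. parsed from an uploaded CSV/XLSX file
# or produced by a language model from a natural language description.
SPEC = {
    "name": {"type": "str", "description": "Full name", "maxLength": 50},
    "zip_code": {"type": "str", "pattern": r"^\d{5}$"},
    "age": {"type": "int"},
}

# JSON-schema-style keyword -> Pydantic v2 Field() keyword argument (assumed mapping).
KEYWORD_MAP = {
    "maxLength": "max_length",
    "pattern": "pattern",
    "description": "description",
}


def render_model(class_name: str, spec: dict[str, dict[str, Any]]) -> str:
    """Render Python source for a Pydantic v2 model from an attribute spec."""
    lines = [
        "from pydantic import BaseModel, Field",
        "",
        "",
        f"class {class_name}(BaseModel):",
    ]
    if not spec:
        # Keep the generated class syntactically valid when there are no fields.
        lines.append("    pass")
    for field_name, attrs in spec.items():
        # "..." marks the field as required in Pydantic.
        args = ["..."]
        for json_key, pydantic_key in KEYWORD_MAP.items():
            if json_key in attrs:
                args.append(f"{pydantic_key}={attrs[json_key]!r}")
        lines.append(f"    {field_name}: {attrs['type']} = Field({', '.join(args)})")
    return "\n".join(lines) + "\n"


source = render_model("Person", SPEC)
print(source)
```

The generated source can be saved to a file and edited by hand, which is where the extra flexibility over a plain JSON schema would come from. Constraints that do not fit into `Field` arguments would need an additional rendering step for validator methods, which this sketch leaves out.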
Given the impressive capabilities of language models in content generation based on schema constraints, focusing on advanced schema generation from natural language descriptions seems like a logical and impactful next step. These features could significantly streamline large-scale schema development.
Thank you for considering these suggestions. I look forward to potentially contributing to the project's growth.
Since the project does not seem to be in active development right now, I am leaving my own implementation here, which covers most of the features above:
https://github.com/lazyhope/metamodel