curiosity-ai/catalyst

How to extract contextual information: names, prescription readings, etc.

Closed this issue · 5 comments

Language
English

Hello, I tried to modify your samples, but I am not getting meaningful information back from text documents.

For example, how can we get names, doctors' notes, medical information, addresses, etc.?

Hi @fasteddys - you need to train an entity recognition model in order to extract information like this. Do you have any training data for these entities?

Hello @theolivenbaum, yes, I have data, but how do I set up or modify the sample, please?

I'm currently using this to train the NER model "TEST", and you may use it as an example. However, the NER does not seem to learn from the training data, so you might need to modify some parts.

        public async Task TrainNlp(List<NerTrainData> trainData)
        {
            // Recognizer stored under the tag "TEST", limited to three entity types.
            var aper = new AveragePerceptronEntityRecognizer(Language.English, 0, "TEST", new string[] { "Person", "Organization", "Location" }, ignoreCase: false);

            // Tags found in the training data; currently unused because the next line is commented out.
            var trainingEntities = trainData.SelectMany(t => t.Associations.Select(a => a.Tag)).ToArray();
            //aper.AddEntityTypes(trainingEntities);
            var documents = new List<Document>();

            foreach (var data in trainData)
            {
                // One document per paragraph, with a single span covering the whole text.
                var sentence = new Document(data.Paragraph);
                var span = sentence.AddSpan(0, sentence.Length);
                foreach (var tag in data.Associations)
                {
                    // Only the annotated ranges are added as tokens, each tagged EntityTag.Inside.
                    var token = span.AddToken(tag.Start, tag.End);
                    token.AddEntityType(new EntityType(tag.Tag, EntityTag.Inside));
                }
                documents.Add(sentence);
            }
            aper.Train(documents);
            await aper.StoreAsync();
        }
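The snippet above does not show what `NerTrainData` looks like. The following is a minimal sketch of the shape it appears to assume, inferred only from the members `TrainNlp` reads (`Paragraph`, `Associations`, `Tag`, `Start`, `End`). The class `NerAssociation`, the helper `FindInvalidOffsets`, and the choice to treat `End` as an inclusive character index are all assumptions made for illustration; none of them come from Catalyst or from the original post.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical annotation type: one labeled character range in a paragraph.
public class NerAssociation
{
    public string Tag { get; set; }   // entity label, e.g. "Person"
    public int Start { get; set; }    // index of the first character of the entity
    public int End { get; set; }      // index of the last character (assumed inclusive)
}

// Hypothetical training record: raw text plus its annotated ranges.
public class NerTrainData
{
    public string Paragraph { get; set; }
    public List<NerAssociation> Associations { get; set; } = new List<NerAssociation>();
}

public static class NerTrainDataChecks
{
    // Returns one message per annotation whose offsets are reversed
    // or fall outside the paragraph text.
    public static List<string> FindInvalidOffsets(IEnumerable<NerTrainData> trainData)
    {
        var problems = new List<string>();
        foreach (var data in trainData)
        {
            foreach (var a in data.Associations)
            {
                bool inRange = a.Start >= 0
                               && a.Start <= a.End
                               && a.End < data.Paragraph.Length;
                if (!inRange)
                {
                    problems.Add($"{a.Tag}: [{a.Start}, {a.End}] does not fit a paragraph of length {data.Paragraph.Length}");
                }
            }
        }
        return problems;
    }
}
```

Running a check like this before training may help rule out bad offsets as one possible reason for a model that does not seem to learn. For instance, in "Jane Smith works at Contoso." a `Person` annotation would use `Start = 0` and `End = 9` under the inclusive-end assumption.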

Hello, thanks for your code, but there is some issue: I tried it out, and the NER does not seem to learn from the training data.

@fasteddys Did you manage to solve this query?