Azure/azure-search-vector-samples

Failing to Implement Custom Skillset [Calls Azure Function]

Opened this issue · 1 comments

Using the "Azure AI Search Integrated Vectorization Sample," I am attempting to create an indexer to execute a predefined skillset involving SplitSkill and AzureOpenAIEmbeddingSkill. However, I need to implement a custom skillset involving WebApiSkill that calls an Azure Function.

I'm encountering an error when trying to import `SearchIndexerSkill` from the `azure.search.documents.indexes.models` module.

I am using the library azure_search_documents-11.4.0b12-py3-none-any.whl as specified in the sample.

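For the WebApiSkill to work, the Azure Function has to implement the custom skill contract: it receives a JSON body with a `values` array of records, each carrying a `recordId` and a `data` object, and must return the same shape with the output data filled in. A minimal sketch of that contract in plain Python (the `embed` function is a placeholder for your real embedding logic; wrap `run_skill` in your HTTP handler of choice, e.g. an Azure Functions HTTP trigger):

```python
import json


def embed(text: str) -> list[float]:
    # Placeholder: call your actual embedding model here.
    return [float(len(text))]


def run_skill(request_body: str) -> str:
    """Apply a custom skill to a Web API skill request payload.

    The indexer sends {"values": [{"recordId": ..., "data": {...}}, ...]}
    and expects the same shape back, with each record's "data" (plus
    optional "errors"/"warnings") filled in.
    """
    payload = json.loads(request_body)
    results = []
    for record in payload.get("values", []):
        text = record["data"].get("text", "")
        results.append({
            "recordId": record["recordId"],   # must be echoed back unchanged
            "data": {"vector": embed(text)},  # key must match the skill's output name
            "errors": None,
            "warnings": None,
        })
    return json.dumps({"values": results})
```

The output name (`vector` here) must match the `OutputFieldMappingEntry` you declare on the WebApiSkill.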

@AniDD

If you want a code sample that creates a custom skillset, please review the integrated vectorization sample notebook.

Copying relevant portion here for convenience:

from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexerClient
from azure.search.documents.indexes.models import (
    IndexProjectionMode,
    InputFieldMappingEntry,
    OutputFieldMappingEntry,
    SearchIndexerIndexProjections,
    SearchIndexerIndexProjectionSelector,
    SearchIndexerIndexProjectionsParameters,
    SearchIndexerSkillset,
    SplitSkill,
    WebApiSkill,
)

# Create a skillset
skillset_name = f"{index_name}-skillset"
  
split_skill = SplitSkill(  
    description="Split skill to chunk documents",  
    text_split_mode="pages",  
    context="/document",  
    maximum_page_length=300,  
    page_overlap_length=20,  
    inputs=[  
        InputFieldMappingEntry(name="text", source="/document/content"),  
    ],  
    outputs=[  
        OutputFieldMappingEntry(name="textItems", target_name="pages")  
    ],  
)  
  
embedding_skill = WebApiSkill(  
    description="Skill to generate embeddings via a custom endpoint",  
    context="/document/pages/*",
    uri=custom_vectorizer_endpoint, 
    inputs=[
        InputFieldMappingEntry(name="text", source="/document/pages/*"),  
    ],  
    outputs=[  
        OutputFieldMappingEntry(name="vector", target_name="vector")  
    ],
)  
  
index_projections = SearchIndexerIndexProjections(  
    selectors=[  
        SearchIndexerIndexProjectionSelector(  
            target_index_name=index_name,  
            parent_key_field_name="parent_id",  
            source_context="/document/pages/*",  
            mappings=[  
                InputFieldMappingEntry(name="chunk", source="/document/pages/*"),  
                InputFieldMappingEntry(name="vector", source="/document/pages/*/vector"),  
                InputFieldMappingEntry(name="title", source="/document/metadata_storage_name"),  
            ],  
        ),  
    ],  
    parameters=SearchIndexerIndexProjectionsParameters(  
        projection_mode=IndexProjectionMode.SKIP_INDEXING_PARENT_DOCUMENTS  
    ),  
)  
  
skillset = SearchIndexerSkillset(  
    name=skillset_name,  
    description="Skillset to chunk documents and generate embeddings",  
    skills=[split_skill, embedding_skill],  
    index_projections=index_projections,  
)  
  
client = SearchIndexerClient(service_endpoint, AzureKeyCredential(key))  
client.create_or_update_skillset(skillset)  
print(f"{skillset.name} created")  
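Once the skillset exists, the remaining piece from your question is the indexer that runs it against a data source. A minimal sketch of the indexer definition as a REST-style body (the `my-*` names and the blob `configuration` settings are placeholders; with the same SDK you would instead build a `SearchIndexer` model and pass it to `SearchIndexerClient.create_or_update_indexer`):

```python
import json


def build_indexer_definition(name: str, data_source: str,
                             target_index: str, skillset: str) -> dict:
    # Shape of the Create Indexer REST body: the indexer ties together the
    # data source to read from, the skillset to run, and the target index.
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": target_index,
        "skillsetName": skillset,
        "parameters": {
            # Example blob-indexer setting; adjust for your data source.
            "configuration": {"dataToExtract": "contentAndMetadata"}
        },
    }


# Placeholder names; PUT/POST this body to
# https://<service>.search.windows.net/indexers?api-version=2023-10-01-Preview
# with your admin api-key header.
indexer = build_indexer_definition(
    "my-indexer", "my-datasource", "my-index", "my-index-skillset"
)
print(json.dumps(indexer, indent=2))
```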

I hope this helps,
Matt