Pinned Repositories
Dev_Notes
Useful notes for setting up servers and explaining a few concepts
EMR_Studio_Hudi
Apache Hudi examples designed to be run on AWS Elastic MapReduce (EMR) via EMR Studio or EMR Notebooks
Flink_Kinesis_Data_Analytics
Apache Flink examples designed to be run by AWS Kinesis Data Analytics (KDA).
OpenSearch_CloudWatch_Alarms
CloudFormation stack automating the deployment of recommended CloudWatch alarms for OpenSearch
OpenSearch_Dashboard_Nginx_Proxy
Access the OpenSearch dashboard of a domain deployed in a private subnet via an Nginx proxy
OpenSearch_Index_Shard_Size
Example covering ideal shard size and how to adjust the number of primary and replica shards for an index
OpenSearch_kNN_Vector_Search
Tokenize and convert sample text data into vectors using BERT. Load the vector representation of the text to OpenSearch and use kNN for semantic search
OpenSearch_Log_Analytics
Introductory workshop on log analytics with AWS OpenSearch
OpenSearch_Neural_Search
OpenSearch Neural Search example. Load BERT into OpenSearch and create embeddings as data is indexed. Use the embeddings to perform vector search
OpenSearch_Resource_Flow_Chart
Flow chart to help you navigate the OpenSearch resources I have built
ev2900's Repositories
ev2900/OpenSearch_CloudWatch_Alarms
CloudFormation stack automating the deployment of recommended CloudWatch alarms for OpenSearch
ev2900/OpenSearch_Dashboard_Nginx_Proxy
Access the OpenSearch dashboard of a domain deployed in a private subnet via an Nginx proxy
ev2900/OpenSearch_Index_Shard_Size
Example covering ideal shard size and how to adjust the number of primary and replica shards for an index
ev2900/OpenSearch_Log_Analytics
Introductory workshop on log analytics with AWS OpenSearch
ev2900/OpenSearch_Resource_Flow_Chart
Flow chart to help you navigate the OpenSearch resources I have built
ev2900/EMR_Studio_Hudi
Apache Hudi examples designed to be run on AWS Elastic MapReduce (EMR) via EMR Studio or EMR Notebooks
ev2900/OpenSearch_Neural_Search
OpenSearch Neural Search example. Load BERT into OpenSearch and create embeddings as data is indexed. Use the embeddings to perform vector search
ev2900/OpenSearch_Refresh_Interval
Example covering how to adjust the refresh interval on an OpenSearch index
ev2900/OpenSearch_kNN_Vector_Search
Tokenize and convert sample text data into vectors using BERT. Load the vector representation of the text to OpenSearch and use kNN for semantic search
ev2900/EMR_Studio_Iceberg
Apache Iceberg examples designed to be run on AWS Elastic MapReduce (EMR) via EMR Studio or EMR Notebooks
ev2900/Iceberg_EMR_Athena
Resources from a virtual tech talk / workshop - Set Up and Use Apache Iceberg Tables on Your Data Lake
ev2900/Iceberg_Glue_from_JARs
Configure any version of Apache Iceberg with AWS Glue by installing Iceberg from JAR files
ev2900/Iceberg_update_metadata_script
Python script that updates S3 file paths in Iceberg metadata files (metadata.json + Avro)
ev2900/OpenSearch_Sigv4_IAM_Auth
Authenticate with OpenSearch via IAM SigV4
ev2900/Outlook_MSG_Parser_Python
Python script to process emails saved with the .msg file extension
ev2900/SecurityLake_AmazonSecurityLakeMetaStoreManager
CloudFormation to automate the deployment of the required IAM roles for AWS Security Lake
ev2900/OpenSearch_API_Examples
Example API calls to set up OpenSearch (via Python) for anomaly detection, cross-cluster replication, loading sample data ...
ev2900/Bedrock_Examples
Example scripts to help you get started with Amazon Bedrock. Most of the examples are Python scripts, Jupyter notebooks, and Streamlit applications
ev2900/BM25_Search_Example
Example to help understand how the BM25 term-based ranking model works in search applications
ev2900/Cosine_Similarity_Search_Example
Example to help understand how to use Hugging Face sentence_transformers to encode searchable text into embeddings and how to use cosine similarity to measure how close a search prompt is to those embeddings
ev2900/Glue_Aggregate_Small_Files
PySpark script to aggregate small Parquet files in a prefix into larger files. Designed to be run on AWS Glue
ev2900/Iceberg_Glue_register_table
Example using the Iceberg register_table command with AWS Glue and Glue Data Catalog
ev2900/Logstash_Example
Logstash log collection and parsing example
ev2900/Managed_Streaming_for_Apache_Kafka_Examples
Code samples for various topics related to Managed Streaming for Apache Kafka (MSK)
ev2900/MongoDB_Streams_Glue_Iceberg
Process MongoDB change streams via AWS Glue with Iceberg to keep a copy of a collection in S3 up to date
ev2900/OpenSearch_Audit_Logs
Brief explanation of REST layer audit logs for Amazon OpenSearch Service
ev2900/OpenSearch_DeletedDocuments
Examples explaining how deletes work in OpenSearch
ev2900/OpenSearch_Local_Dashboard_Server
Connect a locally hosted OpenSearch dashboard server to an Amazon OpenSearch hosted domain
ev2900/OpenSearch_Read_Only_Index
Example covering how to set an OpenSearch index to read-only, a common prerequisite for performance tuning tasks
ev2900/OpenSearch_User_Role_Premission_Managment
Example Python code using the OpenSearch REST API for user, role, and permission management