Building a Search Feature for a Web App Using Amazon OpenSearch

Searching Best Movies With AWS Opensearch

Abstract

  • OpenSearch is a community-driven, open-source fork of Elasticsearch and Kibana, created after the license change in early 2021.
  • This post gives an overview of OpenSearch and walks through a demo that uses OpenSearch together with AWS Amplify for authentication.

🚀 OpenSearch Overview

  • Master nodes are responsible for actions such as creating or deleting indices, deciding which shards should be allocated on which nodes, and maintaining the cluster state of all nodes. The cluster state includes information about which shards are on which node, index mappings, which nodes are in the cluster and other settings necessary for the cluster to operate. Even though these actions are not resource intensive, it is essential for cluster stability to ensure that the master nodes remain available at all times to carry out these tasks.

  • Although in small clusters it is possible to have master nodes which also carry out search and index operations (which is the default configuration), searching and indexing are both resource intensive and can leave the node without sufficient resources to carry out the master node tasks, ultimately resulting in cluster instability.

  • For this reason, once a cluster reaches a certain size it is highly recommended to create 3 dedicated master nodes in different availability zones. The master nodes require excellent connectivity with the rest of the nodes in the cluster and should be in the same network.

  • Use at least three nodes to avoid split brain caused by an unintentionally partitioned network. If the cluster has only 2 nodes, split brain happens when network communication between them is interrupted: each node thinks the other is down and tries to promote itself to master. When network communication is restored, the cluster has two masters, which is the split-brain condition.
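
The split-brain reasoning above can be sketched with a small voting model. This is a simplified illustration, not the actual OpenSearch election algorithm; the function names `majority` and `sidesThatCanElect` are hypothetical and exist only for this example.

```typescript
// Simplified voting model (an assumption for illustration only).

// Smallest number of votes that is a strict majority of `masterEligible` nodes.
function majority(masterEligible: number): number {
  return Math.floor(masterEligible / 2) + 1;
}

// Given the node count on each side of a network partition, count how many
// sides hold at least `requiredVotes` nodes and could therefore elect a master.
// 0 = no master can be elected, 1 = exactly one master, 2 or more = split brain.
function sidesThatCanElect(partition: number[], requiredVotes: number): number {
  return partition.filter((nodes) => nodes >= requiredVotes).length;
}

// Two nodes, each allowed to promote itself alone: both sides elect a master.
const twoNodesSelfPromote = sidesThatCanElect([1, 1], 1); // 2

// Two nodes with a majority rule: neither side reaches 2 votes, so no master.
const twoNodesMajority = sidesThatCanElect([1, 1], majority(2)); // 0

// Three nodes with a majority rule: only the side with 2 nodes elects a master.
const threeNodesMajority = sidesThatCanElect([2, 1], majority(3)); // 1
```

With two nodes there is no setting that keeps the cluster both available and consistent during a partition, which is why three master-eligible nodes is the usual minimum.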

🚀 Fine-grained access control

  • When you enable fine-grained access control in Amazon OpenSearch Service (it cannot be disabled afterwards), the following must also be enabled:
    • Node-to-node encryption

      • It provides an additional layer of security on top of the default features of Amazon OpenSearch Service.
      • Node-to-node encryption enables TLS 1.2 encryption for all communications within the VPC.
      • If you send data to OpenSearch Service over HTTPS, node-to-node encryption helps ensure that your data remains encrypted as OpenSearch distributes (and redistributes) it throughout the cluster. If data arrives unencrypted over HTTP, OpenSearch Service encrypts it after it reaches the cluster. You can require that all traffic to the domain arrive over HTTPS using the console, AWS CLI, or configuration API.
    • Encryption at rest - a security feature that helps prevent unauthorized access to your data. The feature encrypts the following aspects of a domain:

      • All indices (including those in UltraWarm storage)
      • OpenSearch logs
      • Swap files
      • All other data in the application directory
      • Automated snapshots
    • Enforce HTTPS
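
The three prerequisites listed above can be checked before deployment with a small helper. This is a minimal sketch: the `DomainSecuritySettings` shape and the `missingPrerequisites` function are hypothetical names for this example and are not part of any AWS SDK.

```typescript
// Hypothetical, simplified view of the security-related domain settings.
interface DomainSecuritySettings {
  nodeToNodeEncryption: boolean;
  encryptionAtRest: boolean;
  enforceHttps: boolean;
}

// Returns the prerequisites for fine-grained access control that are not yet enabled.
function missingPrerequisites(settings: DomainSecuritySettings): string[] {
  const missing: string[] = [];
  if (!settings.nodeToNodeEncryption) missing.push("node-to-node encryption");
  if (!settings.encryptionAtRest) missing.push("encryption at rest");
  if (!settings.enforceHttps) missing.push("enforce HTTPS");
  return missing;
}

// Example: HTTPS is not enforced, so fine-grained access control cannot be enabled yet.
const gaps = missingPrerequisites({
  nodeToNodeEncryption: true,
  encryptionAtRest: true,
  enforceHttps: false,
}); // ["enforce HTTPS"]
```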

🚀 OpenSearch API Documents

  • Index a single document in the OpenSearch domain
curl -XPUT -u 'os-master-user:HOl0q3$L' 'https://search-opensearch-demo-movie-63wtzcvz3uhgmhotaban5jr6gu.ap-southeast-1.es.amazonaws.com/movies/_doc/1' -d '{"director": "Burton, Tim", "genre": ["Comedy","Sci-Fi"], "year": 1996, "actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"], "title": "Mars Attacks!"}' -H 'Content-Type: application/json'
{"_index":"movies","_type":"_doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"_seq_no":0,"_primary_term":1}
  • Bulk-index documents from a file
curl -XPOST -u 'os-master-user:Q2rmr7hFn@am' 'https://search-opensearch-demo-movie-63wtzcvz3uhgmhotaban5jr6gu.ap-southeast-1.es.amazonaws.com/_bulk' --data-binary @sample-movies.bulk -H 'Content-Type: application/json'
  • Search Document
curl -XGET -u 'os-master-user:Q2rmr7hFn@am' 'https://search-opensearch-demo-movie-63wtzcvz3uhgmhotaban5jr6gu.ap-southeast-1.es.amazonaws.com/movies/_search?q=mars&pretty=true'
{
  "took" : 485,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "movies",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 0.2876821,
        "_source" : {
          "director" : "Burton, Tim",
          "genre" : [
            "Comedy",
            "Sci-Fi"
          ],
          "year" : 1996,
          "actor" : [
            "Jack Nicholson",
            "Pierce Brosnan",
            "Sarah Jessica Parker"
          ],
          "title" : "Mars Attacks!"
        }
      }
    ]
  }
}
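
On the client side, the search response shown above can be given a type and reduced to the fields the UI needs. The interfaces below are a sketch modelled on that sample response only; `Movie`, `SearchResponse`, and `summarizeHits` are names chosen for this example, and real responses may carry additional fields.

```typescript
// Shape of one movie document, taken from the sample `_source` above.
interface Movie {
  director: string;
  genre: string[];
  year: number;
  actor: string[];
  title: string;
}

// Subset of the `_search` response that this example reads.
interface SearchResponse<T> {
  took: number;
  timed_out: boolean;
  hits: {
    total: { value: number; relation: string };
    max_score: number | null;
    hits: { _index: string; _id: string; _score: number; _source: T }[];
  };
}

// Turns each hit into a short display line such as "Mars Attacks! (1996) by Burton, Tim".
function summarizeHits(response: SearchResponse<Movie>): string[] {
  return response.hits.hits.map(
    (hit) => `${hit._source.title} (${hit._source.year}) by ${hit._source.director}`
  );
}
```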

🚀 OpenSearch Pricing

  • Instance Usage
  • Storage cost (EBS)
  • Standard AWS data transfer charges: you pay standard AWS data transfer charges for data transferred in and out of Amazon OpenSearch Service. You are not charged for data transfer between nodes within your Amazon OpenSearch Service domain.
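
The three cost components above add up as in the rough estimate below. Every rate is an input rather than a real AWS price, the field names are hypothetical, and the sketch assumes that the EBS size is provisioned per node.

```typescript
// All rates are caller-supplied placeholders, not actual AWS prices.
interface PricingInputs {
  nodeCount: number;
  instanceHourlyRateUsd: number; // per node, per hour
  hoursPerMonth: number;
  ebsGbPerNode: number;          // assumption: EBS volume size applies per node
  ebsGbMonthRateUsd: number;     // per GB, per month
  billableTransferGb: number;    // traffic in/out of the domain that is charged
  transferRateUsdPerGb: number;
}

// Monthly estimate = instance usage + EBS storage + data transfer.
// Node-to-node traffic inside the domain is not included because it is not charged.
function estimateMonthlyCostUsd(p: PricingInputs): number {
  const instances = p.nodeCount * p.instanceHourlyRateUsd * p.hoursPerMonth;
  const storage = p.nodeCount * p.ebsGbPerNode * p.ebsGbMonthRateUsd;
  const transfer = p.billableTransferGb * p.transferRateUsdPerGb;
  return instances + storage + transfer;
}
```

Check the current Amazon OpenSearch Service pricing page for the actual rates in your region before relying on any number this produces.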

Creating infra services using AWS Cloud Development Kit (CDK)

🚀 Infra Overview

🚀 Create OpenSearch Domain

  • The domain is configured with:
    • A cluster spanning 2 AZs with 4 nodes (no dedicated master nodes)
    • Fine-grained access control enabled
    • Version: OpenSearch 1.0
    • EBS: 10GB (SSD)
    • An access policy that allows es:* for the IAM user or role that is added to the OpenSearch security configuration
    • AWS Secrets Manager to store the master user credentials
opensearch-stack.ts
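
The access policy item in the list above corresponds to a resource-based policy document along these lines. This is a sketch and is not taken from opensearch-stack.ts: `buildDomainAccessPolicy` is a hypothetical helper, and the ARNs in the usage lines are placeholders.

```typescript
interface AccessPolicy {
  Version: string;
  Statement: {
    Effect: "Allow" | "Deny";
    Principal: { AWS: string[] };
    Action: string;
    Resource: string;
  }[];
}

// Builds a policy that allows es:* on every path of the domain for the given
// IAM principals (users or roles).
function buildDomainAccessPolicy(principalArns: string[], domainArn: string): AccessPolicy {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { AWS: principalArns },
        Action: "es:*",
        Resource: `${domainArn}/*`,
      },
    ],
  };
}

// Placeholder account ID, role name, and domain name, for illustration only.
const policy = buildDomainAccessPolicy(
  ["arn:aws:iam::123456789012:role/example-search-lambda-role"],
  "arn:aws:es:ap-southeast-1:123456789012:domain/example-domain"
);
```

With fine-grained access control enabled, the same principal still needs to be mapped to a role inside the OpenSearch security configuration; the resource policy alone only lets the request reach the domain.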

🚀 Create Cognito user pool using Amplify

🚀 Create Lambda function to query OpenSearch domain

  • Pre-create:

    • Log group with a retention of 1 day
    • IAM role with log group permission
    • Lambda function handler lambda-codes/app.js, which runs on Node.js 14 and queries the OpenSearch domain using aws-sdk. Package the source code:
    cd lambda-codes
    npm i .
    zip -r app.zip *
    
  • Lambda function stack

lambda-os.ts
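
One part of such a handler is turning the incoming request into a search body. The sketch below shows only that step and is not the code in lambda-codes/app.js: the `q` query-string parameter, the searched fields (taken from the sample movie document), and the `buildSearchBody` name are assumptions.

```typescript
// Minimal view of the event: only the query-string parameters are read.
interface SearchEvent {
  queryStringParameters?: { [name: string]: string | undefined } | null;
}

type SearchBody = {
  size: number;
  query:
    | { match_all: Record<string, never> }
    | { multi_match: { query: string; fields: string[] } };
};

// Builds the JSON body for a POST to /movies/_search.
// An empty or missing `q` falls back to match_all.
function buildSearchBody(event: SearchEvent, size: number = 10): SearchBody {
  const params = event.queryStringParameters;
  const term = params && params.q ? params.q.trim() : "";
  if (term === "") {
    return { size, query: { match_all: {} } };
  }
  return {
    size,
    query: { multi_match: { query: term, fields: ["title", "director", "actor"] } },
  };
}
```

The handler would then sign and send this body to the domain endpoint; that part depends on the SDK version and is left out here.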

🚀 Create API Gateway using REST API with Lambda integration

  • API Gateway stack
api-rest.ts
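
If the REST API uses Lambda proxy integration (an assumption; api-rest.ts may configure it differently), the function has to return an object with `statusCode`, `headers`, and a string `body`. A minimal sketch, with `toProxyResponse` as a hypothetical helper name:

```typescript
interface ProxyResponse {
  statusCode: number;
  headers: { [name: string]: string };
  body: string;
}

// Wraps any JSON-serializable payload in the proxy-integration response shape.
// The CORS header lets a browser app on another origin read the result;
// "*" is a permissive default chosen for the demo, not a recommendation.
function toProxyResponse(
  statusCode: number,
  payload: unknown,
  allowedOrigin: string = "*"
): ProxyResponse {
  return {
    statusCode,
    headers: {
      "Content-Type": "application/json",
      "Access-Control-Allow-Origin": allowedOrigin,
    },
    body: JSON.stringify(payload),
  };
}
```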

🚀 Import data for indexing

  • Retrieve the password of os-master-user from AWS Secrets Manager, then use it to index the data
curl -XPOST -u 'os-master-user:master-password' 'https://search-opensearch-demo-movie-63wtzcvz3uhgmhotaban5jr6gu.ap-southeast-1.es.amazonaws.com/_bulk' --data-binary @sample-movies.bulk -H 'Content-Type: application/json'
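
The `_bulk` endpoint expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end of the body. The sketch below shows how a file like sample-movies.bulk could be generated; the actual contents of that file are not shown in this post, and `toBulkBody` is a hypothetical helper.

```typescript
interface BulkDoc {
  id: string;
  source: object;
}

// Produces one "index" action line plus one source line per document.
function toBulkBody(index: string, docs: BulkDoc[]): string {
  let body = "";
  for (const doc of docs) {
    body += JSON.stringify({ index: { _index: index, _id: doc.id } }) + "\n";
    body += JSON.stringify(doc.source) + "\n";
  }
  return body;
}

const bulk = toBulkBody("movies", [
  { id: "1", source: { title: "Mars Attacks!", year: 1996 } },
]);
// bulk is:
// {"index":{"_index":"movies","_id":"1"}}
// {"title":"Mars Attacks!","year":1996}
```

Sending the file with `--data-binary`, as in the curl command above, matters because it preserves the newlines that delimit the entries.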

🚀 Use React app to visualize the search results

