てくすた

A technical blog on assorted topics, written by the engineers and designers of PIXTA Inc.

Some basic tips for getting started with cloud search engines

PIXTA Advent Calendar 2017 Day 12.

Search is an important function in any system. Here are some basic but important tips from our experience.

1. Separate JA and overseas Amazon CloudSearch domains

At the beginning of this year, we investigated adding a new language to our search engine and found that we could reduce cost by changing how we used it. At that time, we used a single domain to host all languages (e.g. JA, EN, TH, zh-TW, zh-CN, KO), even though the traffic for JA is much higher than for all the other languages.

So we came up with the idea of separating JA from the others (EN, TH, zh-TW, zh-CN, KO). That way we can reduce storage for the JA domain (a smaller instance size or fewer partitions) and reduce the number of instances for the other languages.

With this simple change, we can reduce cost by 20 to 30%.
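As a rough illustration of the split, the application only needs a small routing step that picks the search domain from the request language. The sketch below is a minimal example under assumptions: the endpoint names and the function `search_endpoint_for` are hypothetical and do not come from our actual setup.

```python
# Hypothetical endpoints for illustration only; the real domain
# endpoints are not given in this post.
JA_DOMAIN = "search-ja.example.com"
OVERSEAS_DOMAIN = "search-overseas.example.com"

# Languages served by the shared overseas domain (from the list above).
OVERSEAS_LANGUAGES = {"en", "th", "zh-tw", "zh-cn", "ko"}


def search_endpoint_for(language: str) -> str:
    """Return the search domain endpoint that hosts the given language."""
    lang = language.strip().lower()
    if lang == "ja":
        return JA_DOMAIN
    if lang in OVERSEAS_LANGUAGES:
        return OVERSEAS_DOMAIN
    raise ValueError(f"unsupported language: {language!r}")
```

With a mapping like this, the JA domain and the overseas domain can be sized independently, and a new language is added by extending the overseas set.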

2. Adopt Elastic Cloud

There are many search engine services to choose from, each with its own advantages and disadvantages. In our experience, even when we don't operate the servers ourselves, we still want to monitor them and understand what is going on, because that information helps us adjust the configuration to meet our demand. Amazon CloudSearch does not let us see what happens inside the server: sometimes it is down or runs slowly, but we have no tools to figure out the problem. In contrast, Elastic Cloud provides many metrics (CPU, memory, storage, etc.), which makes us more confident in managing the search engine service. Currently we are trying it for just one language, and it works very well. We will continue to move more languages to Elastic Cloud.


3. Scale imports into the search engine

Both Amazon CloudSearch and Elastic Cloud support bulk import, and both scale automatically to meet the demand of import requests. So if we have a lot of data (millions of documents), we should use bulk import and run it in parallel to reduce import time. In our case, we sometimes run up to 100 threads and it still works well.
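The idea of batching documents and uploading the batches in parallel can be sketched as follows. This is a minimal sketch, not our production importer: `batched`, `parallel_bulk_import`, and the `upload_batch` callable are hypothetical names, and `upload_batch` stands in for whatever bulk request the chosen service expects. The batch size and worker count are placeholders to tune.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(documents: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(documents)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def parallel_bulk_import(
    documents: Iterable[dict],
    upload_batch: Callable[[List[dict]], int],  # hypothetical: sends one bulk request
    batch_size: int = 1000,
    max_workers: int = 8,
) -> int:
    """Upload documents in batches from a thread pool; return the total imported.

    upload_batch is assumed to return the number of documents it imported.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map submits every batch up front, so all batches are held
        # in memory; a very large data set would need a bounded submit loop.
        return sum(pool.map(upload_batch, batched(documents, batch_size)))


# Usage with a stand-in uploader that only counts documents.
docs = [{"id": i} for i in range(2500)]
total = parallel_bulk_import(docs, upload_batch=len, batch_size=1000, max_workers=4)
```

Threads suit this job because each worker spends most of its time waiting on the network, which is why a high thread count such as 100 can still work well.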


About Duan (ズァン): He is one of the first two engineers to join Pixta Vietnam, where he is Technical Lead. He worked as a Java engineer before joining Pixta to keep challenging himself with new things and to have the chance to work on different technologies and businesses.

About the Development team: The Development team is the biggest team in PIXTA's Hanoi office. We focus on implementing new features and maintaining current functions, especially search, payment, and contributor functions. Our goal is to make our platform better every day by applying new technologies and improving user experience.


www.wantedly.com
