Masabeeh is an Arabic search engine that uses advanced natural language processing tools to provide comprehensive and accurate Arabic web search results. It powers the world’s largest Arabic website, the Al Jazeera Net news portal. Masabeeh is highly customizable: it can mine any amount of Arabic content, index it, and return fast search results. It is built on industry-standard open source components such as Nutch and Solr.
The following are some of Masabeeh's features:
- Simple search
- Custom/advanced search
- Modular, plug-in based architecture
- Scheduled crawling
- URL filters
- Arabic pages filter
- Parsing of plain text and files
- Relevancy ranking
- Web graph scoring
- Highlighting
- More from site
- Result pagination
- Logical operators (AND, OR, NOT)
- Search for terms linguistically related to the search words
- Alerts feature
- Personalization
- Meta tags parsing
- Exclusion of part of a page from the index
- Faceted search
- Several caches for faster query responses
- Replication and dynamic clustering support
- A simple web-based administration GUI to configure the system easily
- Logging of user search activity for use in analyzing the system
- Search results ranking customization, to order results by custom requirements
- Multiple views of the results using multiple XSL stylesheets
- Results returned as XML for easy use by third-party applications
- Support for more than one index, each with its own specific features

Arabic language tools:

- Stemmer
- Rooter
- Spellchecker
- Search with derivatives
- Search with synonyms and a thesaurus
- Arabic dictionaries
- Auto linking
- Auto complete
- Finding related articles
- Auto suggestion
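
Because Masabeeh is based on Solr, several of the features listed above (logical operators, result pagination, highlighting, faceted search, XML output) correspond to standard Solr query parameters. The following is a minimal sketch of how a third-party application might build such a request. It assumes a plain Solr-style `select` endpoint; the host, the core name `arabic`, and the field names `site` and `content` are hypothetical, and Masabeeh's actual endpoint and schema may differ.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_search_url(base_url, terms, page=1, page_size=10,
                     facet_field=None, highlight_field=None):
    """Build a Solr-style search URL (illustrative helper, not Masabeeh's API)."""
    params = [
        ("q", terms),                        # may contain AND, OR, NOT
        ("start", (page - 1) * page_size),   # pagination offset (0-based)
        ("rows", page_size),                 # results per page
        ("wt", "xml"),                       # ask for an XML response
    ]
    if highlight_field:
        # Standard Solr highlighting parameters.
        params += [("hl", "true"), ("hl.fl", highlight_field)]
    if facet_field:
        # Standard Solr field-faceting parameters.
        params += [("facet", "true"), ("facet.field", facet_field)]
    # urlencode percent-encodes Arabic text as UTF-8.
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field names, for illustration only.
url = build_search_url(
    "http://localhost:8983/solr/arabic/select",
    "قطر AND اقتصاد",
    page=3,
    facet_field="site",
    highlight_field="content",
)
query = parse_qs(urlsplit(url).query)
print(url)
```

The XML returned by such a request could then be transformed with different XSL stylesheets to produce the multiple result views mentioned in the list.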