{"id":1897,"date":"2018-07-26T13:47:23","date_gmt":"2018-07-26T13:47:23","guid":{"rendered":"http:\/\/capture.ccio.us\/?p=1897"},"modified":"2018-08-07T17:48:21","modified_gmt":"2018-08-07T17:48:21","slug":"predictive-research-engine-architecture","status":"publish","type":"post","link":"https:\/\/capture.club\/portal\/2018\/07\/26\/predictive-research-engine-architecture\/","title":{"rendered":"Capture Club Predictive Research Engine Architecture"},"content":{"rendered":"<body><p><\/p>We are NOT a Search Engine.\u00a0 We are a <strong>Research Engine<\/strong>.\n<p>Over the years, we have developed a fast, robust, resource-friendly, distributed framework for handling near-real-time Machine Learning and Predictive Analytics.<\/p>\n<p>At a high level, our architecture looks like this:<\/p>\n<figure id=\"attachment_1985\" aria-describedby=\"caption-attachment-1985\" style=\"width: 300px\" class=\"wp-caption aligncenter\"><img decoding=\"async\" class=\"wp-image-1985 size-medium\" src=\"http:\/\/capture.ccio.us\/wp-content\/uploads\/2018\/07\/Screenshot-2018-08-04-06.29.26-300x169.png\" alt=\"Capture Club Predictive Research Engine Architecture. \" width=\"300\" height=\"169\" loading=\"lazy\"><figcaption id=\"caption-attachment-1985\" class=\"wp-caption-text\">This is our architecture for the Predictive Research Engine.<\/figcaption><\/figure>\n<p>External data sources (e.g., websites, documents, file systems, etc.) are ingested using our proprietary Indexing Engine, and either persisted into <a href=\"http:\/\/lucene.apache.org\/solr\/\">Solr<\/a> or used for real-time <a href=\"https:\/\/en.wikipedia.org\/wiki\/Machine_learning\">Machine Learning<\/a> by <a href=\"http:\/\/capture.ccio.us\/memefinder-no-dictionaries-needed\/\">EMPATHS<\/a>. 
Once an inverted index is constructed from the ingested data, <a href=\"http:\/\/capture.ccio.us\/haidie-heuristic-analytics-integrated-data-interpretation-engine\/\">HAIDIE<\/a> goes to work building out dynamic <a href=\"https:\/\/www.quora.com\/What-does-it-mean-by-Classifier-in-Artificial-Intelligence-and-Machine-Learning\">classifiers<\/a>. These classifiers are essentially nano-applications that have a topic and a set of associations that are constantly re-evaluated for relevancy and priority.\u00a0 When a query or query profile is entered into the system, HAIDIE goes to work and provides near-real-time Machine Learning, responding to the <a href=\"http:\/\/capture.ccio.us\/zen\/\">ZEN<\/a> interface via the <a href=\"http:\/\/capture.ccio.us\/tame-truly-asynchronous-messaging-environment\/\">TAME<\/a> protocol.\u00a0 This allows for an ongoing conversation and lets the system learn from the user.<\/p>\n<p>Some of the most innovative aspects of our research engine are:<\/p>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Runs on an average AWS micro Linux instance. <\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Consumes a mere 250 MB on average when running, yet still performs like a typical search engine. \u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Capable of scaling out to a distributed platform to handle more intensive research requests. \u00a0<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Capable of communicating with other registered tasks to distribute research tasks. <\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Local database is optional.<\/span><\/li>\n<\/ul>\n<p><\/p>\n<\/body>","protected":false},"excerpt":{"rendered":"<p>We are NOT a Search Engine.\u00a0 We are a Research Engine. 
Over the years, we have developed a fast, robust, resource-friendly, distributed framework for handling near-real-time Machine Learning and Predictive Analytics. At a high level, our architecture looks like this: External data sources (e.g., websites, documents, file systems, etc.) are ingested using our proprietary Indexing Engine, and [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"pagelayer_contact_templates":[],"_pagelayer_content":"","footnotes":""},"categories":[],"tags":[],"class_list":["post-1897","post","type-post","status-publish","format-standard","hentry"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/posts\/1897","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/comments?post=1897"}],"version-history":[{"count":0,"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/posts\/1897\/revisions"}],"wp:attachment":[{"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/media?parent=1897"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/categories?post=1897"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/capture.club\/portal\/wp-json\/wp\/v2\/tags?post=1897"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}