SBIR Phase II: Linguistic Analysis of Web Content for 21st Century Inquiry Learning

Period of Performance: 02/18/2015 - 02/28/2017


Phase 2 SBIR

Recipient Firm

462 Ballytore Rd
Wynnewood, PA 19096
Firm POC, Principal Investigator


This Small Business Innovation Research Phase II project engages K-12 students in inquiry-based learning, not only in STEM education but across the curriculum. The information age has mandated radical changes in the skill set required for the 21st century. To teach these skills, as articulated in the Common Core Standards, teachers are expected to target students' ability to conduct short as well as more sustained research projects. The Internet is not only the obvious go-to place for teaching these skills but also one of the reasons for mandating them. Easy access to vast amounts of information has made it possible for everyone to conduct research, analyze data, and build new knowledge. The problem, however, is that the Internet is not a library: its information is not reliable or catalogued, and it cannot be searched by topic or, as in school libraries, by reading level. The proposed technology, a personalized librarian on demand, addresses this problem. It offers learners of all ages an unprecedented opportunity to engage in inquiry, enables differentiated learning, and gives learners a lifelong tool to help them understand and critically engage with digital information. The target market is not limited to the U.S. The linguistic algorithms developed here apply to languages other than English, allowing early targeting of the international market by building web curators for other languages. This project blends research on linguistics, the psychology of reading, natural language processing (NLP), and machine learning. Innovative NLP technology is used to detect age-appropriate sites that are relevant to the curriculum. The technology automatically analyzes the reading difficulty and thematic content of websites in real time during search. The purpose of the system is to make adaptive, individualized recommendations that match the student's reading level and are appropriate for her current familiarity with the topic.
The patent-pending adaptive component is built using machine learning and collaborative filtering methods. The resulting research tool, a personalized web curator, is based on novel methods for modeling the student's familiarity with the topic of web resources. These methods are extended to include novel analysis of video content. Unlike other adaptive educational products, the system's adaptive recommendations do not rely on obtaining test scores from the user and mapping them to pre-labeled, leveled data. The complete solution includes an integrated gamified experience that helps students acquire basic and advanced research skills and develop the habit of applying them consistently. The project team includes educational technology experts from academia and schools and conducts extensive studies evaluating the learning benefits of the differentiated instruction the tool affords, including in special education programs and for English Language Learners.
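
As a rough illustration of matching web pages to a student's reading level, the sketch below scores text with the published Flesch-Kincaid grade formula and keeps pages whose score falls near a target grade. This is a stand-in, not the project's method: the abstract does not say which readability model is used, and the function names, the vowel-group syllable heuristic, and the tolerance value are all assumptions made for the example.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    lower = word.lower()
    count = len(re.findall(r"[aeiouy]+", lower))
    # Treat a trailing 'e' as silent (e.g. "like"), except in "-le" endings.
    if lower.endswith("e") and not lower.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def recommend(pages: dict[str, str], student_grade: float, tolerance: float = 2.0) -> list[str]:
    """Return page ids within `tolerance` grades of the student, closest first."""
    scored = [(abs(flesch_kincaid_grade(text) - student_grade), page_id)
              for page_id, text in pages.items()]
    return [page_id for distance, page_id in sorted(scored) if distance <= tolerance]


# Hypothetical sample pages, for illustration only.
pages = {
    "easy": "The cat sat. The dog ran. We like to play.",
    "hard": ("Photosynthesis converts electromagnetic radiation into "
             "chemical energy within specialized organelles."),
}
matches = recommend(pages, student_grade=0.0, tolerance=3.0)
```

A formula of this kind captures only surface difficulty (sentence and word length). The system described above also models thematic content and the student's familiarity with the topic, which this sketch does not attempt.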