Fenotyper

Fenotyper is an artificial intelligence (AI)-based software platform that generates clinical phenotype definitions from plain-language descriptions using large language models (LLMs) and retrieval-augmented generation (RAG). The system converts a researcher’s or clinician’s description of a patient population of interest into a structured definition that can be applied to electronic health record (EHR) data. The platform combines human input with AI reasoning to identify relevant medical concepts and organize them into inclusion and exclusion criteria, which experts can then review and refine. By reducing manual coding and improving clarity, Fenotyper supports faster, more consistent identification of patient cohorts for research and healthcare improvement, and helps translate large health datasets into usable information for decision making.

Background: 
Clinical and translational research depends on identifying patient groups that share specific medical characteristics, often referred to as phenotypes. Creating these definitions requires experts to review clinical terminology, write rules, and validate outputs. The process is time-consuming and difficult to reproduce across institutions. Because many approaches rely on custom programming, specialized informatics teams, or manual rule authoring, research efforts often spend substantial time preparing datasets before analysis can begin. In addition, definitions developed by different groups for the same condition may vary, limiting reproducibility and comparability across studies.

Fenotyper addresses these challenges by allowing users to describe a patient population in natural language while the system produces a computable definition that can be applied to EHR systems. The platform provides traceable logic and editable outputs, enabling experts to validate and refine results rather than build them manually. This improves reproducibility, reduces preparation time, and supports large-scale clinical research and quality improvement activities.
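
As a rough illustration of what a "computable definition" with inclusion and exclusion criteria might look like, the following Python sketch encodes one plain-language description as a structured object and applies it to a few toy patient records. This is not Fenotyper's actual output format or matching logic: the class name `PhenotypeDefinition`, the function `select_cohort`, the prefix-matching rule, and the sample records are all assumptions made for illustration. The diagnosis codes are ICD-10 categories chosen only as familiar examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeDefinition:
    """Hypothetical structured phenotype (illustrative, not Fenotyper's format)."""
    name: str
    description: str                          # the original plain-language request
    inclusion_prefixes: tuple[str, ...]       # patient needs at least one of these
    exclusion_prefixes: tuple[str, ...] = ()  # patient must have none of these

    def matches(self, patient_codes: set[str]) -> bool:
        # A code matches a criterion if it starts with any listed prefix,
        # so "E11" covers "E11.9", "E11.65", and so on.
        included = any(c.startswith(self.inclusion_prefixes) for c in patient_codes)
        excluded = any(c.startswith(self.exclusion_prefixes) for c in patient_codes)
        return included and not excluded


def select_cohort(definition: PhenotypeDefinition,
                  records: dict[str, set[str]]) -> list[str]:
    """Return the sorted IDs of patients whose codes satisfy the definition."""
    return sorted(pid for pid, codes in records.items() if definition.matches(codes))


# Assumed example: a definition an expert might review and edit.
T2D = PhenotypeDefinition(
    name="type_2_diabetes",
    description="Patients with type 2 diabetes, excluding type 1 or gestational diabetes",
    inclusion_prefixes=("E11",),          # ICD-10: type 2 diabetes mellitus
    exclusion_prefixes=("E10", "O24.4"),  # ICD-10: type 1 diabetes; gestational diabetes
)

# Toy records (fabricated for illustration): patient ID -> recorded diagnosis codes.
RECORDS = {
    "p1": {"E11.9", "I10"},     # type 2 diabetes and hypertension
    "p2": {"E11.65", "E10.9"},  # has an excluded type 1 code
    "p3": {"I10"},              # no diabetes code
    "p4": {"E11.9", "O24.4"},   # has an excluded gestational diabetes code
}

cohort = select_cohort(T2D, RECORDS)
print(cohort)
```

Keeping the original description alongside the explicit criteria is one way to keep the logic traceable: a reviewer can compare the request with the codes chosen and edit either list without touching the matching code.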

Applications: 

  • Clinical and translational research, quality improvement, and performance monitoring
  • Population health analytics, public health studies, and health system reporting
  • Healthcare data science, biomedical informatics, and decision support
  • Pharmaceutical research and clinical trial cohort identification


Advantages: 

  • Converts plain-language clinical questions into computable definitions, reducing manual coding and rule creation
  • Provides readable reasoning that experts can review, refine, and audit
  • Improves consistency across institutions and accelerates patient cohort identification
  • Works within EHR environments and allows non-programmers to work with complex clinical data
  • Supports faster large-scale biomedical research and operational analytics