Building & Training a Risk Model
Risk Model Overview
A risk model is a type of machine learning model that predicts a “risk” score for a given outcome. The outcome can be modeled as an occurrence (the likelihood of a binary event) or as a regression (a continuous value).
Creating a Risk Model
- Navigate to the folder where you want your model to be stored
- Click the “+” icon located to the right of “Models”
- The following box will pop up:
- Type: Select "Risk"
- Define Model and Dataset
– Build in Curia App: Select if you want to define an outcome, intervention, cohort, and time period in the app (from which Curia will generate the model population and features)
– Upload pre-compiled data: Select if you have a completely custom training dataset (including features) that you wish to use
- Enter a name for your model
- (Optional) Write a description of your model
- Click the "Create Model" button, which will open a new page
Building in the Curia App
Outcome and Intervention
You will first be prompted to select an Outcome and Intervention
Outcome
- Select Outcome
– The Point & Click workflow allows you to define diagnosis, procedure, .... outcomes by selecting specific codes
– To model a custom outcome, select “Custom Outcome” and choose the outcome dataset you wish to use.
- Select Outcome Type:
– To predict the likelihood a given binary event occurs, select Occurrence
– To predict a continuous outcome, select Regression
-- After selecting Regression, you must choose an aggregation type.
-- Note that all aggregation types other than "count" will use event cost as the value to aggregate
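The difference between "count" and the cost-based aggregation types can be sketched as follows. This is a minimal illustration, not Curia's implementation: the event records, the function name `aggregate_outcome`, the aggregation names "sum", "mean", and "max", and the choice to return 0.0 for an individual with no events are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical outcome-period events for one individual: (code, event cost).
events = [("E11.9", 120.0), ("E11.9", 80.0), ("I10", 40.0)]


def aggregate_outcome(events, aggregation):
    """Turn a list of (code, cost) events into one regression target.

    "count" counts the events; every other aggregation type
    aggregates the event cost instead.
    """
    if aggregation == "count":
        return float(len(events))

    costs = [cost for _code, cost in events]
    if not costs:
        # Assumption: an individual with no events gets a target of 0.
        return 0.0

    # Assumed set of cost-based aggregation types, for illustration only.
    aggregators = {"sum": sum, "mean": mean, "max": max}
    return float(aggregators[aggregation](costs))


for name in ("count", "sum", "mean", "max"):
    print(name, aggregate_outcome(events, name))
```

With the sample events, "count" yields 3.0 while "sum" yields 240.0, which shows why the two kinds of target are on very different scales.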
Generate Cohorts
- Set Evidence: The period of time that the covariates (features) for training are built on. This is always 12 months.
- Set Delays:
- For a risk model, data delay + pre-outcome delay are added together to produce a total delay indicating the gap between the end of the evidence period and the beginning of an outcome measurement.
- Data delay: Accounts for any delay in data availability. For example, a 90-day waiting period for claims to be processed would result in a 3-month data delay.
- Pre-outcome delay: The period of time to wait before measuring an outcome. For example, if a known treatment such as a medication takes 1 month to produce any results, the pre-outcome delay should be 1 month.
- Outcome: The period of time that information is aggregated over to generate the modeling outcomes
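The way these periods line up can be worked through with a small date calculation. The sketch below is only an illustration of the arithmetic described above: the function and parameter names are hypothetical, periods are treated as whole months starting on the first of a month, and each period is half-open (the end date is the first day after the period).

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, snapping to the first of the month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def risk_model_timeline(evidence_start, data_delay, pre_outcome_delay,
                        outcome_months, evidence_months=12):
    """Lay out the evidence, delay, and outcome periods for one cohort.

    All durations are in months. The evidence period is 12 months,
    and the total delay is data delay + pre-outcome delay.
    """
    evidence_end = add_months(evidence_start, evidence_months)
    total_delay = data_delay + pre_outcome_delay
    outcome_start = add_months(evidence_end, total_delay)
    outcome_end = add_months(outcome_start, outcome_months)
    return {
        "evidence": (evidence_start, evidence_end),
        "total_delay_months": total_delay,
        "outcome": (outcome_start, outcome_end),
    }


# 3-month data delay (claims processing) plus 1-month pre-outcome delay,
# followed by an assumed 6-month outcome period.
timeline = risk_model_timeline(date(2022, 1, 1), data_delay=3,
                               pre_outcome_delay=1, outcome_months=6)
print(timeline["evidence"])            # 2022-01-01 up to 2023-01-01
print(timeline["total_delay_months"])  # 4
print(timeline["outcome"])             # 2023-05-01 up to 2023-11-01
```

In this example the outcome measurement begins four months after the evidence period ends, because the two delays are added together.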
Window
– Rolling (Multiple Cohorts): Select to generate more training data from a rolling window of several incremental time periods. Then select the date when the evidence period starts and the date when you want the outcome period to end across your windows.
– Fixed (Single Cohort): Select if you’re certain about the exact start and end dates for the evidence and outcome periods.
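One way to picture the rolling option is as a fixed-length window sliding forward between the earliest evidence start and the latest outcome end. The sketch below rests on assumptions: the one-month step, the function names, and the rule that a cohort is kept only if its outcome period ends on or before the overall end date are illustrative choices, not documented Curia behavior.

```python
from datetime import date


def add_months(d, months):
    """Shift a date by whole months, snapping to the first of the month."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


def rolling_cohorts(first_evidence_start, last_outcome_end, total_delay,
                    outcome_months, step_months=1, evidence_months=12):
    """List (evidence_start, outcome_start, outcome_end) for each cohort.

    The window slides forward by step_months (an assumed step size)
    until the outcome period would run past last_outcome_end.
    """
    cohorts = []
    start = first_evidence_start
    while True:
        outcome_start = add_months(start, evidence_months + total_delay)
        outcome_end = add_months(outcome_start, outcome_months)
        if outcome_end > last_outcome_end:
            break
        cohorts.append((start, outcome_start, outcome_end))
        start = add_months(start, step_months)
    return cohorts


cohorts = rolling_cohorts(date(2021, 1, 1), date(2023, 1, 1),
                          total_delay=4, outcome_months=6)
for evidence_start, outcome_start, outcome_end in cohorts:
    print(evidence_start, outcome_start, outcome_end)
```

Here each window spans 22 months (12 evidence, 4 delay, 6 outcome), so a 24-month range yields three cohorts. A fixed window corresponds to the single-cohort case.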
Require full outcome period data for individuals
– Some individuals will die, or switch to a new healthcare organization, before the end of the outcome period. When modeling the occurrence of a specific code, such an individual could have had that code, but it would not be recorded because they are no longer in the dataset.
– Checking this option counteracts this by including only individuals who have data after the outcome period ends
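The effect of this option can be sketched as a simple filter on each individual's last date of data coverage. The coverage dates, the function name, and the exact comparison (coverage must reach the first day after the outcome period) are assumptions for illustration.

```python
from datetime import date

# Hypothetical last date on which each individual still has data.
coverage_end = {
    "A": date(2023, 12, 1),  # still present after the outcome period
    "B": date(2023, 8, 1),   # left the dataset mid-outcome period
    "C": date(2023, 11, 1),  # present right up to the end
}

# First day after the outcome period (half-open convention, assumed).
outcome_end = date(2023, 11, 1)


def with_full_outcome_period(coverage_end, outcome_end):
    """Keep only individuals whose data extends to the end of the outcome period."""
    return sorted(pid for pid, last_seen in coverage_end.items()
                  if last_seen >= outcome_end)


print(with_full_outcome_period(coverage_end, outcome_end))  # ['A', 'C']
```

Individual B is dropped: a missing code for B cannot be distinguished from a code that occurred after B left the dataset.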
Population
– New Code Filter: Add a filter that includes only, or excludes, patients with specific codes in their data
– New Demographic Filter: Add a filter that includes only, or excludes, patients who share specific demographic information
– Select Dataset: Define the cohort using a dataset. The cohort dataset must already have been uploaded to the platform. For more information, see the Datasets Guide.
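The include/exclude behavior of the code and demographic filters can be sketched as set and predicate checks over patient records. The record layout, field names, and function names below are hypothetical and only illustrate the logic.

```python
# Hypothetical patient records for illustration.
patients = [
    {"id": "A", "age": 67, "codes": {"E11.9", "I10"}},
    {"id": "B", "age": 45, "codes": {"I10"}},
    {"id": "C", "age": 72, "codes": {"E11.9"}},
]


def code_filter(patients, codes, include=True):
    """Keep patients who have (include) or lack (exclude) any of the codes."""
    return [p for p in patients if bool(p["codes"] & codes) == include]


def demographic_filter(patients, predicate, include=True):
    """Keep patients who match (include) or do not match (exclude) the predicate."""
    return [p for p in patients if bool(predicate(p)) == include]


# Include only patients with code E11.9, then exclude those aged 70 or over.
population = code_filter(patients, {"E11.9"}, include=True)
population = demographic_filter(population, lambda p: p["age"] >= 70,
                                include=False)
print([p["id"] for p in population])  # ['A']
```

Filters applied in sequence narrow the population step by step, so the order does not change the final result here, only the intermediate sizes.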
- Once you have configured all of these elements, click "Preview Model Data" to run queries that generate the relevant dataset and output summary statistics
- Hit "Train Model"
- View status on progress bar
- Any errors that occur will be displayed
- See Interpreting Risk Model Results