Model Building 103

 

In this tutorial, we will put our B-cell model to the test and use it to analyze a real B-cell sample file. We will start with the model we created in Model Building 101 and 102, and we will work through the process of adjusting the model to a real sample. Once the model fits, we should be able to select the additional parameters that are in our sample file and see how they behave in the B-cell progression – without having to model them ourselves.

At this stage, we are jumping out of our model development cycle and into the real world of analyzing real samples.

 

images\modelbuildingflowchart.gif

 

Open the Model

On the main toolbar, click the Open Document button. Select “Model102.gs” and click the Open button to load our working model.

 

Disable the Profiles

Normally, when GemStone opens an FCS data file, it automatically analyzes the file with the model it currently has open. In making the transition from synthesized data to real data, we don’t want this to happen because we want to see the unanalyzed data from our real sample. So we need to turn off all of the parameter profiles in the model first. We can do this with one action.

Click the Utility Action button on the Workspace toolbar, and select Disable Profiles from the menu.

 

images\model103resetflags.gif

 

All of our plots will redraw without color, showing all events in an unanalyzed state.

Model Building Tip: Use Disable Profiles

Disable Profiles is used to turn off the profiles for all parameters in the cell type. This can be useful if you want to restart the model building process but don't want to reconstruct the model from scratch.

 

Open the Data File

On the main toolbar, click the Select FCS Files button. Navigate to GemStone’s Sample Files folder, select BCellLineage.fcs, and click Open. The file is added to the File Database and read into our model.

Now we will enable the pieces of our model, one at a time, starting with the CD19 and SSC selection parameters. Selection parameters, as we learned in Model Building 101, are those that have constant parameter profiles and are used to select populations of interest. In this case, CD19 and SSC will be used to select B-cells from our sample file.

 

Adjust and Enable CD19

Click the shrink tool for the CD19 Parameter Profile plot to expand the plot.

 

images\model103cd19shrinktool.gif

 

The B-cells are those with high expression of CD19. We need to move the CD19 Control Points down to identify the correct cells. Click and drag one of the control points to identify the band of CD19 positive cells.

We also need to adjust the line spread to encompass the bright CD19 events. Use the Line-spread slider or the mouse scroll wheel to widen the line spread until it covers a broad band of bright CD19 events.

 

images\model103cd19.gif

 

You might worry that we are being too generous in selecting with CD19 here. That is intentional at this point: we don’t want to exclude dim CD19 events yet. As we adjust the model, those dim events will start to drop out.

Click the Enable tool on the CD19 parameter profile to turn it on (images\matchstatuson.gif), and then click the Classify Data button on the main toolbar (images\classifydatabutton.gif).

 

Turn Off Means and Confidence Limits for SSC

We’re ready to move on to SSC. Click the shrink tool for the SSC Parameter Profile plot to expand the plot.

GemStone is trying to be helpful by showing 95% confidence limits and the mean of the analyzed events. In this case, however, the extra graphics are making it difficult to see what we need to see. Let’s turn off the extra graphics.

Right-click in the SSC plot area and choose Edit Graphics from the context menu.

 

images\model103editgraphics.gif

 

In the Edit Graphic Options dialog, select the Means tab. Uncheck the Enable Means option.

Select the Conf Limits tab and change Line Spread Mode to Model SD Bars.

Select the Model tab and uncheck Enable Model.

Click OK. The SSC plot should now display only the dots and the control points.

 

Adjust and Enable SSC

Slide the Control Points for SSC up, centering the points on the low-intensity band of events. Adjust the line-spread for the profile. The result should look something like this:

 

images\model103sscadjusted.gif

 

Click the Enable tool on the SSC parameter profile to turn it on (images\matchstatuson.gif), and then click the Classify Data button on the main toolbar (images\classifydatabutton.gif).

 

Enrich

B-cells are a relatively small percentage of the cells in this sample, less than 3%. That can make things difficult to visualize. So we’ll use a feature called Enrich Data to search out all of the B-cells (those that have high CD19 and low SSC) and select them exclusively. We do this after adjusting our selection parameters and before we move on to the parameters that transition during the progression.

On the main toolbar, click the Enrich Data button. Now we’ve got plenty of B-cells to analyze.

 

images\model103enriched.gif

 

Model Building Tip: Use Enrich Data to See More of the Cells of Interest

Enrich Data restricts the analysis to only those events that match the current model. This concentrates events that would otherwise represent a small fraction of the entire sample.
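The idea behind enrichment can be sketched in a few lines of Python. This is only an illustration of the concept, not GemStone's implementation: the function name `enrich`, the thresholds, and the synthetic intensities are all assumptions made up for the example.

```python
def enrich(events, cd19_min, ssc_max):
    """Keep only events that look like B-cells: bright CD19 and low SSC."""
    return [e for e in events if e["CD19"] >= cd19_min and e["SSC"] <= ssc_max]


# Synthetic sample (made-up values): 250 B-cell-like events among 10,000.
events = []
for i in range(10_000):
    jitter = (i % 11) - 5  # deterministic spread of -5..+5 around each center
    if i < 250:
        events.append({"CD19": 80 + jitter, "SSC": 20 + jitter})  # B-cell-like
    else:
        events.append({"CD19": 20 + jitter, "SSC": 60 + jitter})  # everything else

# Hypothetical thresholds standing in for the CD19 and SSC selection profiles.
selected = enrich(events, cd19_min=50, ssc_max=40)
percent = 100 * len(selected) / len(events)
print(f"{len(selected)} of {len(events)} events kept ({percent:.1f}%)")
```

After filtering, every remaining event is a candidate B-cell, which is why the plots fill in once the sample is enriched.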

 

Adjust and Enable TdT

Click the shrink tool for the TdT Parameter Profile plot to expand the plot.

Here we can see two bands of events: a sparse band of high-intensity and a dense band of low-intensity TdT. Let’s move our Control Points down to the approximate centers of the bands.

Adjust the control points down as shown:

 

images\model103tdtadjust.gif

 

Model Building Tip: Place Definition Points at the Right Intensity

When determining the initial position for the definition points, it is most important to get close to the right intensity level. The definition point should be placed near the center of the band of data it represents. The line spread should be set to approximate the variance of the data. The state index position of the definition points can be estimated based on the density of the dots at the given intensity.
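As a rough illustration of this tip, the sketch below estimates a starting intensity and line spread for each band from the data itself. It assumes the median is a reasonable "center" and the standard deviation a reasonable "spread"; GemStone's own estimators are not described here, and the function name, band limits, and intensities are hypothetical.

```python
import statistics


def band_center_and_spread(intensities, low, high):
    """Return (median, standard deviation) of the intensities within [low, high]."""
    band = [x for x in intensities if low <= x <= high]
    return statistics.median(band), statistics.stdev(band)


# Made-up intensities: a sparse bright band near 70 and a dense dim band near 20.
intensities = [68, 70, 72, 71, 69] + [18, 19, 20, 21, 22, 20, 19, 21, 20, 20]

# The band limits (50 as the dividing line) are chosen by eye for this example.
bright_center, bright_spread = band_center_and_spread(intensities, 50, 100)
dim_center, dim_spread = band_center_and_spread(intensities, 0, 50)

print(f"bright band: center {bright_center}, spread {bright_spread:.2f}")
print(f"dim band:    center {dim_center}, spread {dim_spread:.2f}")
```

The centers give the Y position for each definition point, and the spreads suggest how wide to set the line spread for each band.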

 

Use the Estimate X Positions adjuster tool for TdT to fine-tune the control points’ positions.

 

images\model103adjustx.gif

 

The profile now shows the step-down progression we designed into our model.

 

images\model103tdtafteradjust.gif

 

At this point, we have selected B-cells and created the first structure in the progression backbone using the TdT profile. GemStone has re-ordered the events so that the high-intensity TdT events are at the start of the progression, the low-intensity TdT events are at the end of the progression, and intermediate TdT events are in the transition between high and low. This order is reflected in all of the parameter profile plots, and we start to see the hidden structure appearing in the other parameters.

 

Adjust and Enable CD10

The CD10 plot shows 3 levels of intensity, just what our model is designed for. The first level of high-intensity events is clustered at the beginning of the progression. Then we see a slightly lower intensity band of dense dots, and a diffuse band of low intensity events.

 

images\model103cd10.gif

 

Once again, we’ll adjust the positions of the Control Points to get a good starting estimate. Move the controls approximately as shown, paying attention primarily to the Y-axis positioning to place the points in the middle of the bands.

 

images\model103cd10adjust.gif

 

Now here is a new trick. Notice that the CD10-negative band of dots is significantly wider than the intermediate and high CD10 bands. We can help GemStone out here by adjusting the line spread for that low-intensity band to be wider than the other bands. Here’s how we do that.

Click the 5th Control Definition point in the plot. This will select the point.

 

images\model103select5_6.gif

 

Now position the mouse over the 5th point and use the mouse scroll wheel to adjust the line-spread wider. If you do not have a scroll wheel on your mouse, use the Line-spread slider in the Parameter Profile panel.

 

This adjustment will make things easier for the auto-adjusting functions. Use the Estimate X Positions adjuster tool for CD10 to fine-tune the control points’ positions.

 

images\model103adjustx.gif

 

 

The finished result has matched our data well.

 

images\model103cd10adjusted.gif

 

Using the Frequency Plot

How do we know that the model is really matching the data well? We use the Frequency Plot to help make that assessment. This is the bottom element in the Cell Type widget. There are 3 things we examine on this plot to determine how well the model is working.

At the top, RCS shows us the reduced chi square value, which is a statistical measure of how well the model matches the data. A value approaching 1.0 is just about perfect. Typical values can be between 3 and 7 with more complex models.
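For readers who want a feel for the statistic, here is a minimal sketch of a reduced chi-square calculation. It assumes the textbook form (Pearson chi-square of observed versus expected counts, divided by the degrees of freedom); the exact formula GemStone uses for RCS is not given in this tutorial, and the function name and counts below are made up.

```python
def reduced_chi_square(observed, expected):
    """Pearson chi-square divided by degrees of freedom (number of bins - 1)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    dof = len(observed) - 1
    if dof <= 0:
        raise ValueError("need at least two bins")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2 / dof


# Hypothetical event counts in six states, with 50 events expected in each.
observed = [52, 47, 55, 44, 50, 52]
expected = [50] * 6

rcs = reduced_chi_square(observed, expected)
print(f"reduced chi-square: {rcs:.3f}")
```

Larger departures from the expected counts push the value up, which is why a poorly fitting model shows a high RCS.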

 

images\frequencyplot.gif

 

Across the plot is a jagged black line showing us the frequency distribution for each state in our model. There is also a horizontal red line showing the ideal distribution of events. The ideal distribution simply takes the number of events we’re working with and divides that by the number of states in the progression. A Probability State Model attempts to distribute events in this manner, so this is ideally what our model would do. The frequency distribution line shows how our model is actually distributing events, and it should bounce above and below the ideal line as we see in our model so far. The smoothed frequency shows a weighted average frequency distribution. See the discussion on what is a Probability State Model for more on this.
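The arithmetic behind the ideal line and a smoothed frequency can be sketched as follows. The ideal value follows directly from the description above (events divided by states). The smoothing is an assumption: a simple 1-2-1 weighted moving average stands in for whatever weighting GemStone actually applies, and the counts are invented.

```python
def ideal_frequency(n_events, n_states):
    """Events per state if the model distributed events perfectly evenly."""
    return n_events / n_states


def smooth(frequencies, weights=(1, 2, 1)):
    """Weighted moving average; at the edges, only existing neighbors are used."""
    half = len(weights) // 2
    smoothed = []
    for i in range(len(frequencies)):
        total = 0.0
        weight_sum = 0.0
        for k, w in enumerate(weights):
            j = i + k - half  # index of the neighbor this weight applies to
            if 0 <= j < len(frequencies):
                total += w * frequencies[j]
                weight_sum += w
        smoothed.append(total / weight_sum)
    return smoothed


# Hypothetical per-state event counts for a six-state progression.
frequencies = [52, 47, 55, 44, 50, 52]

ideal = ideal_frequency(sum(frequencies), len(frequencies))
smoothed = smooth(frequencies)
print(f"ideal frequency per state: {ideal}")
print("smoothed:", [round(s, 2) for s in smoothed])
```

In a good fit, the raw counts scatter above and below the ideal value while the smoothed values stay close to it.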

So, our model is looking pretty good at this point.

 

Adjust and Enable CD20

The CD20 plot shows two levels of intensity.

Move the Control Points to the vertical center of the dense bands of dots. Adjust the line-spread for the control points. Then click the Estimate X Positions adjuster tool for CD20.

 

images\model103adjustx.gif

 

This gives us a pretty good fit with the data. We can see an upregulation that then drops off again, which is what our Three Levels profile is designed for. However, this Step Up profile is also providing a very good fit, so we’ll keep it. We can always come back and change it to a Three Levels profile later.

 

images\model103cd20revised.gif

 

 

Add the Remaining Parameters

This data file has many more parameters than we are currently modeling. What do you suppose will happen if we display the remaining parameters without defining any expectations in our model? Will there be any structure in them, or will they be evenly distributed bands of dots on the profile plots? Let’s find out.

In the Cell Type Properties panel, click the Edit button for Select Parameters. Select all of the “Live” parameters in the list on the left side of the dialog (even the ones we’ve already added), and then click the Add button.

 

images\model103selectall.gif

 

Click OK to close the dialog.

Now with 11 parameters loaded, the best way to see the other parameters is to look at the Parameter Overlay plot.

 

images\model103overlay.gif

Even without creating a model for CD45, CD43, CD34, CD23, and others, we can see how these parameters behave with B-cells in our progression. Our model defined the progression well enough that the other parameters reveal information without any effort on our part. For example, we can see that CD45 is lowest at the start of the progression, it increases in zones I and II, and then levels off in zone III. It elevates again at the end of III and levels off in IV and V.

 

Saving our Model

We’ve made several enhancements to our B-cell model. This model is now capable of analyzing real data, so we should save a new copy of it.

From the main toolbar, click Save Document. Type “Model103.gs” for the file name and click Save.

 

Other Enhancements

If you are familiar with B-cell maturation, you know that at some point along the way a B-cell will either express Kappa or Lambda – not both. When you have a branch like this, you generally need more than one Cell Type to properly characterize the process. This will be the focus of Model Building 104, the last in our series.

 

See also:

Model Building 104

Cell Types

Cell Type widget

Parameter Profile Descriptions