Deconvolution of DSC Data Using Python

Keywords: DSC, Deconvolution, JSON, Python, LMFit, SciPy

TB110

Abstract

This technical brief discusses an approach to the deconvolution of DSC data and the visualization of the deconvoluted results using Python® and its associated libraries. The JSON file and the Python script (in Jupyter® Notebook format) can be downloaded here.

Introduction

One of the challenges you may encounter is transition peaks that overlap. It can then be necessary to mathematically deconvolute the data to develop a better understanding of the nature of the material. In general, this involves fitting models to represent the data.

Previous technical bulletins have discussed the ability to export TRIOS™ Software data files in a controlled JSON format [1], the import of this into Python and dataframe interaction [2], the visualization of the data [3], and introduced basic integration of a polymer melt peak [4]. This interaction with the data can be extended further to deconvolute multiple peaks.

Experimental

For this example of peak deconvolution, a 50:50 w/w mix of Polypropylene (PP) and High-Density Polyethylene (HDPE) was produced. A 15 mg sample was taken and run in the TA Instruments™ Discovery™ Differential Scanning Calorimeter (DSC) using the following heat – cool – reheat method:

  1. Equilibrate -50.00 °C
  2. Ramp 10.00 °C/min to 190.00 °C
  3. Isothermal 5.0 min
  4. Ramp 10.00 °C/min to -50.00 °C
  5. Isothermal 5.0 min
  6. Ramp 10.00 °C/min to 190.00 °C

The result file was exported from TA Instruments TRIOS Software in a JSON format and imported into Python as a dataframe. The first heat was then copied to a separate dataframe on which the peak deconvolution was carried out.

Figure 1 shows the plot of the first heating cycle (heat flow against temperature) and, as might be expected, two melt peaks can be observed: the lower relating to the melt of the HDPE and the higher to the melt of the PP.

Figure 1. First heat data of the 50:50 HDPE:PP polymer blend

Curve Fitting

For the curve fitting and deconvolution, the LMfit library [5] was used for non-linear least-squares minimization and curve fitting in Python. As with many Python libraries, it is built on an existing library; in this case it builds on and extends many of the optimization methods of the scipy.optimize module. It contains a wide selection of predefined models that can be used directly but also allows you to build your own models.

Python Script

The Python script will:

  • Import the required libraries
  • Import the data file being investigated and create the dataframe (also possible to plot this data)
  • Define any required functions
  • Define the fitting functions
  • Minimize the functions and review the data

During the development of the script, it may be useful to plot output results at important steps. These can be entered as required. It may also be useful to retain these in the script so checks can be made on each of the files in an experiment.

The first step is to import the required libraries. The bulk of the libraries to be imported are the same as discussed previously for integration [4]. In addition, the LMfit library [5] is imported for the deconvolution. In this example, the Pearson IV model was used; the same model was used previously [6] when investigating peak deconvolution through commercial software. However, it is worth noting that LMfit offers a wide range of models, including the ability to define your own.

The data is then imported and the dataframe created:
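The exact layout of the TRIOS JSON export is not reproduced here; as an illustrative sketch only, assuming the export can be read into a list of records (the field names below are assumptions, not the actual TRIOS keys):

```python
import json

import pandas as pd

# Hypothetical JSON content standing in for the exported result file;
# adapt the keys to the actual TRIOS export structure.
raw = '[{"Temperature": 25.0, "Heat flow": -0.21},' \
      ' {"Temperature": 25.1, "Heat flow": -0.22}]'

# Parse the JSON text and build the dataframe
df = pd.DataFrame(json.loads(raw))
```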

From the TRIOS plot, or by plotting this data (Figure 1), you can see that the region of interest is between 0 °C and 185 °C. A second dataframe can be created from the data within that region.
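A sketch of this slicing step, assuming the dataframe columns are named "Temperature" and "Heat flow" (the actual names depend on the export), with synthetic data standing in for the imported file:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the imported first-heat dataframe
temp = np.linspace(-50.0, 190.0, 2401)
df = pd.DataFrame({"Temperature": temp,
                   "Heat flow": np.zeros_like(temp)})

# Keep only the 0 degC to 185 degC region of interest
peaks_df = df[(df["Temperature"] >= 0.0) &
              (df["Temperature"] <= 185.0)].reset_index(drop=True)
```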

In general, when deconvoluting data, this is easiest done from a zero baseline. The heat capacity or thermodynamic baseline is subtracted from the data; this is effectively the heat flow signal where there are no transitions, just the heat flow due to the sample’s heat capacity.

A function is generated, which produces a linear baseline based on the first and last data points in the peaks_df dataframe.

This is then executed to find the baseline heat flow data which is also appended to the peaks_df dataframe.

This baseline data is then subtracted from the heat flow signal and the baseline subtracted data is also added to the peaks_df dataframe.
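The baseline steps above might be sketched as follows, with synthetic data and assumed column names standing in for the real peaks_df:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for peaks_df: a sloping baseline plus one melt-like peak
t = np.linspace(0.0, 185.0, 400)
peaks_df = pd.DataFrame({
    "Temperature": t,
    "Heat flow": -0.2 - 0.002 * t - 0.5 * np.exp(-((t - 131.0) / 4.0) ** 2),
})

def linear_baseline(df, x_col="Temperature", y_col="Heat flow"):
    """Straight line through the first and last rows of the dataframe."""
    x0, x1 = df[x_col].iloc[0], df[x_col].iloc[-1]
    y0, y1 = df[y_col].iloc[0], df[y_col].iloc[-1]
    return y0 + (y1 - y0) * (df[x_col] - x0) / (x1 - x0)

# Append the baseline, then subtract it from the heat flow signal
peaks_df["Baseline"] = linear_baseline(peaks_df)
peaks_df["Baseline Subtracted Heat Flow"] = (
    peaks_df["Heat flow"] - peaks_df["Baseline"])
```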

This data was plotted to show the effect of the baseline subtraction as shown in Figure 2.
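A minimal plotting sketch for this overlay (synthetic data; the column names are assumptions carried over from the steps above):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # render off-screen, suitable for scripted use
import matplotlib.pyplot as plt

# Synthetic stand-in for peaks_df with a baseline-subtracted column
t = np.linspace(0.0, 185.0, 400)
hf = -0.2 - 0.002 * t - 0.5 * np.exp(-((t - 131.0) / 4.0) ** 2)
baseline = np.linspace(hf[0], hf[-1], len(hf))
peaks_df = pd.DataFrame({"Temperature": t,
                         "Heat flow": hf,
                         "Baseline Subtracted Heat Flow": hf - baseline})

# Overlay the original and baseline-subtracted signals (compare Figure 2)
fig, ax = plt.subplots()
ax.plot(peaks_df["Temperature"], peaks_df["Heat flow"], label="Original")
ax.plot(peaks_df["Temperature"], peaks_df["Baseline Subtracted Heat Flow"],
        label="Baseline subtracted")
ax.set_xlabel("Temperature (°C)")
ax.set_ylabel("Heat flow (W/g)")
ax.legend()
```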

Figure 2. Overlay of original and baseline subtracted data

The resultant Baseline Subtracted Heat Flow signal can then be used for the deconvolution.

The next step is to set up the fitting model. As noted, previous investigations [6] utilized a commercial numerical analysis package employing a Pearson IV model, which showed a good fit to the polymer melt data. The same model is utilized in this investigation.

As there are two events, two fitting functions are created. Each fitting function is defined by a number of parameters, and the main functions are defined as an array (p41 and p42 in this case).

The prefix value shown is used to identify the parameters for each of the fitting functions.

To create a starting point for the fitting optimization, initial guesses are made on some of the parameters. It is worth noting, however, that if multiple files are being investigated of a similar nature (for example mixes of the same polymers) then the initial guesses should hold for all data.

While there are a number of parameters it is not usually necessary to give initial settings for all of them. In general, for polymer melting it is likely that only an initial guess value for the peak temperature is required.

For this example, guesses for the peak value (center) and the peak height (amplitude) were made.

The complete model is the sum of the two separate models and so combines all of their parameters. A summed fit is then generated; minimizing its difference from the data defines the final parameter values.

The initial guess can be plotted; as this is based on the guess parameters, it is likely to look wrong. The eval() command essentially evaluates the model based on the initial guess parameters set. Figure 3 shows the plot of this initial guess.

Figure 3. Plot of the initial guess fit data

The model parameters can then be optimized to fit the original data (as closely as possible within the bounds of the model selected).

The model.fit() call fits the model to the original data and returns the optimized result, which is saved in the "result" object. This contains the parameters for both fitting models. The "comps" variable is a dictionary of the fitted component data.

The final resultant combined model can then be plotted and compared with the original data.

This is shown in Figure 4.

Figure 4. Comparison of the original DSC data (baseline subtracted) and the curve fitted data

To complete the analysis, the separate models are plotted. For convenience a new dataframe is created and the line indexes are reset back to zero.

A dataframe is created from the comps dictionary.

The dataframe of the fitted data contains two columns of data, one for each of the fitted peaks. These are extracted and added to the final_df dataframe.

In this example, the baseline heat flow was added back to the fitted peak data. In this way the deconvoluted data could be overlaid with the original heat flow data.
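A sketch of collecting the components into a fresh dataframe and re-adding the baseline (the comps dictionary, baseline, and column names here are illustrative stand-ins, not the original script's values):

```python
import numpy as np
import pandas as pd

x = np.linspace(0.0, 185.0, 400)

# Illustrative stand-ins for the fitted components and the stored baseline
comps = {"p41_": 0.5 * np.exp(-((x - 131.0) / 4.0) ** 2),
         "p42_": 0.6 * np.exp(-((x - 163.0) / 5.0) ** 2)}
baseline = -0.2 - 0.002 * x

# New dataframe with a zero-based index holding the two fitted peaks;
# adding the baseline back lets each peak overlay the original heat flow
final_df = pd.DataFrame({"Temperature": x})
final_df["Peak 1"] = comps["p41_"] + baseline
final_df["Peak 2"] = comps["p42_"] + baseline
```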

The area (enthalpy change) of the two deconvoluted peaks was also calculated:
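As a sketch of the area calculation: with heat flow in W/g plotted against temperature, the area under a peak divided by the heating rate (in °C/s) gives the enthalpy change in J/g. The peak shape and values below are illustrative:

```python
import numpy as np
from scipy.integrate import trapezoid

# 10 degC/min heating rate expressed in degC/s
beta = 10.0 / 60.0

# Synthetic deconvoluted peak (W/g) against temperature (degC)
x = np.linspace(0.0, 185.0, 400)
peak1 = 0.5 * np.exp(-((x - 131.0) / 4.0) ** 2)

# Enthalpy change: area under the peak divided by the heating rate
delta_h = trapezoid(peak1, x) / beta  # J/g
```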

A final plot was then generated that showed the original data and the two deconvoluted melt peaks.

The final plot with the reported enthalpy changes is shown in Figure 5.

Figure 5. Overlay of original melt peak and the deconvoluted peaks

Conclusions

This technical brief presented an overview of the approach to signal deconvolution of differential scanning calorimetry data. The approach can be extended to other data sources where required.

References

  1. P. Davies, “TB105: TRIOS and JSON Export: Overview of JSON Export from TRIOS and Import into Python,” TA Instruments, New Castle, DE.
  2. P. Davies, “TB106: TRIOS and JSON Export: Basic Interaction with the Dataframe in Python,” TA Instruments, New Castle, DE.
  3. P. Davies, “TB107: TRIOS and JSON Export: Working With Matplotlib.Pyplot in Python,” TA Instruments, New Castle, DE.
  4. P. Davies, “TB109: TRIOS and JSON Export: Integrating Differential Scanning Calorimetry Data,” TA Instruments, New Castle, DE.
  5. “LMfit: Non-Linear Least-Squares Minimization and Curve-Fitting for Python,” LMfit documentation.
  6. G. Slough, “TA431 – Deconvolution of Thermal Analysis Data using Commonly Cited Mathematical Models,” TA Instruments, New Castle, DE.

Acknowledgement

For more information or to request a product quote, please visit www.tainstruments.com to locate your local sales office information.

This paper was written by Philip Davies, Principal Applications Scientist, TA Instruments.

Python is a registered mark of Python Software Foundation. Jupyter is a trademark of LF Charities, of which Project Jupyter is a part. pandas, Matplotlib, NumPy, SciPy are trademarks of NumFOCUS, Inc. TA Instruments, Discovery, and TRIOS are trademarks of Waters Technologies Corporation.

