The first class to be detailed here is the ScanFileHolder. This class acts as storage for a single scan; the idea is that multiple scans can be loaded and then checked or compared as required. All of the basic functionality shown here can be incorporated into scripts or used directly in the Jython terminal. The examples are written for the Jython terminal and assume that some scans have already been performed on simulated or real components.
The Data Analysis and Visualisation toolkit for the GDA is designed to make the visualisation and manipulation of all data collected on the beam-line quick and effective. This documentation concentrates on demonstrations and examples, so it is useful to have a working copy of the GDA attached to the DLS network. Let's start with a brief example: the following script loads a previous scan file and visualises some of its data.
>>> data = ScanFileHolder()
>>> data.loadSRS()
>>> data.info()
Trying to open /home/ssg37927/gda/gda-7.4/users/data/70.dat
ScanFileContainer object containing 41 lines of data per DataSet.
0 SampleYaw
1 detect1
2 detect1
3 time
4 I0
5 It
6 If
7 TEY
8 PEY
9 FY
10 Drain
>>> data.plot(0,1)
This should display the following in the “Data Vector” tab in the GDA:
Remember that you can undock the “Data Vector” tab and put it wherever is easiest to see it, such as on another screen or beside the Jython terminal.
Going through the script line by line:
One of the functions available in the Analysis and Visualisation toolkit is the ability to look quickly at collected data from within the GDA. The previous example showed a simple plot of one set of collected data against another, but other functionality is available. The following script uses various different forms of the plot command. It also makes use of another version of the “loadSRS” command, which takes a number as its argument. However, this only works on scans which have been recorded in the current data directory, and will not work if the data directory has been changed since the scan was taken.
>>> data = ScanFileHolder()
>>> data.loadSRS(-13)
>>> data.info()
ScanFileContainer object containing 101 lines of data per DataSet.
0 SampleYaw
1 detect1
2 detect1
3 time
4 I0
5 It
6 If
7 TEY
8 PEY
9 FY
10 Drain
11 Channel_0
12 Channel_1
13 Channel_2
14 Channel_3
15 Channel_4
16 Channel_5
17 Channel_6
18 Channel_7
>>> data.plot(0,1)
>>> data.plot("SampleYaw",11)
>>> data.plot("SampleYaw",[11,"Channel_3",1])
This should produce the following outputs:
The next example shows how to work with data once it is in the form of a dataset. The ScanFileHolder is essentially a container for many different datasets, and there are several ways of getting them out to look at more closely.
>>> data = ScanFileHolder()
>>> data.loadSRS(-31)
>>> data.info()
ScanFileContainer object containing 21 lines of data per DataSet.
0 SampleYaw
1 detect1
2 time
3 I0
4 It
5 If
6 TEY
7 PEY
8 FY
9 Drain
10 Channel_0
11 Channel_1
12 Channel_2
13 Channel_3
14 Channel_4
15 Channel_5
16 Channel_6
17 Channel_7
>>> dataset1 = data.getAxis("detect1")
>>> dataset1.disp()
DataVector Dimentions are [21]
[-5.5897e-02, 2.4862e-02, 2.9554e-01, 2.7648e-01, ...
>>> dataset1 = data.getAxis(10)
>>> dataset1.disp()
DataVector Dimentions are [21]
[1.9380e+01, 1.4850e+01, 2.1220e+01, 2.9110e+01, ...
>>> dataset1 = data[11]
>>> dataset1.disp()
DataVector Dimentions are [21]
[2.2580e+01, 1.4330e+01, 2.4240e+01, 3.1930e+01, ...
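The two lines that follow each dataset1.disp() call are the output from the disp() command: the dimensions of the dataset, followed by its values.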
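The three access styles used in the script (a column name, a column index, and square-bracket indexing) can be pictured with a small stand-in container. This is a plain-Python sketch for illustration only and is not the GDA implementation; the class name `ColumnHolder`, its method `get_axis` and its internals are all hypothetical.

```python
class ColumnHolder(object):
    """Hypothetical stand-in for a scan container: ordered, named columns."""

    def __init__(self, names, columns):
        if len(names) != len(columns):
            raise ValueError("one name is needed per column")
        self._names = list(names)
        self._columns = [list(c) for c in columns]

    def get_axis(self, key):
        """Return a column selected by position (int) or by name (str)."""
        if isinstance(key, int):
            return self._columns[key]
        # list.index raises ValueError if the name is unknown
        return self._columns[self._names.index(key)]

    def __getitem__(self, key):
        # Square-bracket access simply delegates to get_axis
        return self.get_axis(key)


holder = ColumnHolder(["SampleYaw", "detect1"],
                      [[0.0, 0.5, 1.0], [2.0, 4.0, 8.0]])
by_name = holder.get_axis("detect1")
by_index = holder[1]
```

Both lookups return the same underlying column, which mirrors how a name and its index are interchangeable in the terminal examples.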
The following table details the main functions of the ScanFileHolder class.
The DataSet class is the core of all the fitting and visualisation architecture presented here. It is an N-dimensional expandable container, which can be manipulated and visualised in many different ways. As was seen in the section on ScanFileHolders, the internals of these objects are datasets. This section details some of the uses of the dataset, including manipulation and direct plotting.
The DataSet is the main way of handling sets of data in the GDA, and there are many different ways of manipulating the data it holds. The following script demonstrates a few of these methods, and also shows how to get datasets out of a ScanFileHolder.
>>> data = ScanFileHolder()
>>> data.loadSRS(1)
>>> data.info()
ScanFileHolder object containing 26 lines of data per DataSet.
0 SampleYaw
1 detect1
2 detect2
>>> dataset1 = data[1]
>>> Plotter.plot("Data Vector",data[0],dataset1)
>>> dataset1.max()
1.0630006379056933
>>> dataset1.min()
-0.9609246487585549
>>> dataset1 -= dataset1.min()
>>> dataset1 /= dataset1.max()
>>> dataset1.max()
1.0
>>> dataset1.min()
0.0
>>> Plotter.plot("Data Vector",data[0],dataset1)
The idea here is to normalise the data to between 0 and 1, and then to plot it. This should produce graphical output as shown below.
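The two in-place operations in the script (subtract the minimum, then divide by the new maximum) amount to the following arithmetic. This is a plain-Python sketch on ordinary lists, not GDA code; the function name `normalise` is an illustrative assumption.

```python
def normalise(values):
    """Rescale a sequence so its minimum maps to 0.0 and its maximum to 1.0."""
    lowest = min(values)
    # Step 1: shift so the smallest value becomes zero
    shifted = [v - lowest for v in values]
    # Step 2: divide by the largest shifted value so it becomes one
    span = float(max(shifted))
    if span == 0.0:
        raise ValueError("cannot normalise a constant sequence")
    return [v / span for v in shifted]


raw = [-1.0, 0.0, 1.0, 3.0]
scaled = normalise(raw)
```

Note that the order matters: dividing by the maximum before subtracting the minimum would not, in general, leave the data spanning exactly 0 to 1.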
Sometimes it can be useful to see the differences or similarities between two scans, or two detectors in the same scan. The following script highlights the noise between two different signals and evaluates that difference statistically.
>>> data = ScanFileHolder()
>>> data.loadSRS(1)
>>> data.info()
ScanFileHolder object containing 26 lines of data per DataSet.
0 SampleYaw
1 detect1
2 detect2
>>> dataset1 = data[1]
>>> dataset2 = data[2]
>>> Plotter.plot("Data Vector", data[0], [dataset1, dataset2])
>>> dataset3 = dataset1-dataset2
>>> Plotter.plot("Data Vector", data[0], [dataset1, dataset2, dataset3])
>>> dataset3.mean()
0.0067027887342955145
>>> dataset3.rms()
0.06266747949694666
>>> dataset3.skew()
0.5610499636907474
>>> dataset3.kurtosis()
-0.5355342349265815
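This script looks at the difference between two detectors which are recorded in the same scan. It first plots the data to check that it is correct, and then plots the difference alongside the original signals so that it can be visualised. The mean, rms, skew and kurtosis calls then summarise that difference statistically.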
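For readers who want to see what the four statistics measure, the sketch below computes them for the element-by-element difference of two lists. It is plain Python, not the GDA implementation, and it assumes population moments and excess kurtosis (so a normal distribution would give 0); the exact definitions used by the GDA may differ. The function name `difference_stats` is hypothetical.

```python
import math


def difference_stats(a, b):
    """Summarise the element-by-element difference a - b."""
    diff = [x - y for x, y in zip(a, b)]
    n = float(len(diff))
    mean = sum(diff) / n
    # Root mean square of the raw differences (not mean-subtracted)
    rms = math.sqrt(sum(d * d for d in diff) / n)
    # Central moments about the mean
    m2 = sum((d - mean) ** 2 for d in diff) / n
    m3 = sum((d - mean) ** 3 for d in diff) / n
    m4 = sum((d - mean) ** 4 for d in diff) / n
    if m2 == 0.0:
        raise ValueError("skew and kurtosis are undefined for constant data")
    return {
        "mean": mean,
        "rms": rms,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis (assumption)
    }


signal_a = [1.0, 2.0, 3.0, 4.0]
signal_b = [1.0, 1.0, 1.0, 1.0]
stats = difference_stats(signal_a, signal_b)
```

A mean close to zero with a small RMS suggests the two signals differ only by noise, while a non-zero skew indicates that the differences are not symmetric about the mean.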
Table 6.7. DataSet Method Listing

| Method | Inputs | Outputs | Description |
|---|---|---|---|
| Constructor | DataSet | | Makes a copy of the dataset given as the input. |
| Constructor | double[] | | Generates a 1D dataset containing the data provided as the input. |
| Constructor | int[] | | Generates a dataset with the dimensionality of the length of the input array, and a size per dimension given by the input array values. For example, DataSet(2,3,4) would create a 3D dataset with sizes 2 in the first, 3 in the second and 4 in the third dimensions. |
| Constructor | int w, int h, double[] data | | Makes a new 2D dataset with height h and width w, filled with the data. This is filled quickest along the width. |
| Constructor | int w, int h, int d, double[] data | | Makes a new 3D dataset with height h, width w and depth d, filled with the data. This is filled quickest along the width, then along the height. |
| Constructor | JAMA Matrix | | Creates a 2D dataset which has the same data and proportions as the input JAMA Matrix. |
| abs | | DataSet | Returns the absolute value of each element of the dataset in a new dataset. |
| centroid | | double | Returns the centroid value of the dataset. This is effectively the point along the dataset which is the centre of mass of all the values. |
| chiSquared | DataSet | double | Compares the dataset element by element with the input dataset. The differences between them are squared and summed, and this is the value that is returned. |
| cos | | DataSet | Returns the cosine of every value in the dataset, as a new dataset. |
| sin | | DataSet | Returns the sine of every value in the dataset, as a new dataset. |
| exp | | DataSet | Returns the exponential of every value in the dataset, as a new dataset. |
| ln | | DataSet | Returns the natural log of every value in the dataset, as a new dataset. |
| log10 | | DataSet | Returns the log base 10 of every value in the dataset, as a new dataset. |
| pow | double | DataSet | Returns each value in the dataset raised to the power given to the function, as a new dataset. |
| norm | | DataSet | Returns a new dataset where all the values are normalised to between 0 and 1, and scaled appropriately. |
| lnnorm | | DataSet | Returns a new dataset where the natural log of all the values is normalised to between 0 and 1, and scaled appropriately. |
| lognorm | | DataSet | Returns a new dataset where the log base 10 of all the values is normalised to between 0 and 1, and scaled appropriately. |
| max | | double | Returns the maximum value of the dataset. |
| maxPos | | int | Returns the position of the maximum value of the dataset. |
| min | | double | Returns the minimum value of the dataset. |
| minPos | | int | Returns the position of the minimum value of the dataset. |
| mean | | double | Returns the mean value of the dataset. |
| rms | | double | Returns the root mean squared value of the dataset. |
| skew | | double | Returns the skew of the dataset. |
| kurtosis | | double | Returns the kurtosis of the dataset. |
| diff | | DataSet | Calculates the differential of the dataset and returns it as a new dataset. This assumes that all the points are 1.0 apart. |
| diff | int n | DataSet | Calculates the differential of the dataset and returns it as a new dataset. This assumes that all the points are 1.0 apart. n is the number of points from each side that are taken as an average to reduce noise. |
| diff | DataSet | DataSet | Calculates the differential of the dataset and returns it as a new dataset. The input dataset gives the x coordinates for the calculation. |
| diff | DataSet, int n | DataSet | Calculates the differential of the dataset and returns it as a new dataset. The input dataset gives the x coordinates for the calculation. n is the number of points from each side that are taken as an average to reduce noise. |
| disp | | | Displays the contents of the dataset in the Jython terminal. |
| doubleArray | | double[] | Returns all the data in the dataset as a single double array. |
| doubleMatrix | | double[][] | If the dataset is 2D, this returns the data as an array of arrays of doubles. |
| getJamaMatrix | | double[][] | If the dataset is 2D, this returns the data as a JAMA Matrix. |
| get | int[] | double | Returns the data at the location specified by the input. |
| set | double value, int[] | double | Sets the data at the location specified by the input to the input value. |
| getDimensions | | int[] | Returns the dimensionality and size of the dataset as an array of integers. |
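Two of the descriptions in the method listing are easier to follow as arithmetic. The sketch below is plain Python on ordinary lists, not GDA code. It assumes that chiSquared is the plain sum of squared differences, as the description states, and that the centroid is the value-weighted mean of the zero-based element positions; the second point is an assumption, since the listing does not say which positions are used. The function names are hypothetical.

```python
def chi_squared(a, b):
    """Sum of squared element-by-element differences between two sequences."""
    if len(a) != len(b):
        raise ValueError("both sequences must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(values):
    """Centre of mass along the sequence: sum(i * v) / sum(v)."""
    total = float(sum(values))
    if total == 0.0:
        raise ValueError("centroid is undefined when the values sum to zero")
    # Positions are taken to be the zero-based indices (assumption)
    return sum(i * v for i, v in enumerate(values)) / total


mismatch = chi_squared([1.0, 2.0, 3.0], [1.0, 0.0, 5.0])
centre = centroid([0.0, 1.0, 2.0, 1.0, 0.0])
```

For the symmetric peak in the second example the centroid lands on the middle element, as expected for a centre of mass.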
A basic peak fitting routine in the Analysis toolkit is demonstrated in the following script.
>>> data = ScanFileHolder()
>>> data.loadSRS(1)
>>> data.plot(0,1)
>>> output = Fitter.plot(data[0],data[1]+1,GradientDescent(0.0001),[Gaussian(0.0,10.0,10.0,10.0),Gaussian(0.0,10.0,10.0,10.0)])
>>> print output.disp()
7.883976029749007(0.0,10.0)
2.9464538192926537(0.0,10.0)
6.4733892793832455(-10.0,10.0)
1.56093364851906(0.0,10.0)
2.936788227366031(0.0,10.0)
6.420297623699959(-10.0,10.0)
>>> output[1].getValue()
2.9464538192926537
>>> output[1].getUpperLimit()
10.0
>>> output[1].getLowerLimit()
0.0
The images shown below are of the base data, and then the fitted data. The green line represents the line of best fit through the data, with the other lines showing the individual functions which make up the fit. At the bottom of the plot there is a reference line, along with some detail showing how far the fit is from the data. This provides a quick visual representation of the error in the fit.
Table 6.9. Fitting Methods Available

| Method | Description |
|---|---|
| MonteCarlo | This method uses Monte Carlo methods to solve the problem. The value entered when creating this is currently abstract: a smaller number is more accurate, and a larger number is quicker. 0.001 is a good starting point. |
| GradientDescent | This method uses the gradient descent method to optimise the problem. The differential of the configuration space is taken, and the direction found is stepped along by a certain amount. If the distance stepped causes the objective function to increase (a negative result), the distance is reduced until a positive result is obtained. When the distance stepped to obtain a positive result has been reduced to less than the argument value, the search ends. |
| GeneticAlg | This method uses differential evolution genetic algorithms to perform the search. This creates a group of agents which each perform a search over the configuration space to try to find good solutions. This is a slightly slower method, but performs much better on harder problems. The value entered here is the stop criterion: when the average objective function value of all the agents is within this value of the best agent's objective function value, the search stops. |
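The GradientDescent description above can be sketched in a few lines. This is a generic, plain-Python illustration of the stepping rule as described, not the GDA optimiser: the halving factor, the numerical central-difference gradient, the starting step of 1.0 and all function and parameter names are assumptions made for the example.

```python
import math


def numerical_gradient(objective, point, h):
    """Central-difference estimate of the gradient at a point."""
    gradient = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        gradient.append((objective(up) - objective(down)) / (2.0 * h))
    return gradient


def gradient_descent(objective, start, tolerance, initial_step=1.0,
                     h=1e-6, max_iterations=10000):
    """Step downhill, shrinking the step whenever a move makes things worse."""
    point = list(start)
    value = objective(point)
    step = initial_step
    for _ in range(max_iterations):
        if step < tolerance:
            break  # step has shrunk below the stop criterion
        gradient = numerical_gradient(objective, point, h)
        length = math.sqrt(sum(g * g for g in gradient))
        if length == 0.0:
            break  # flat spot: nowhere to go
        trial = [p - step * g / length for p, g in zip(point, gradient)]
        trial_value = objective(trial)
        if trial_value < value:
            point, value = trial, trial_value  # accept the improvement
        else:
            step /= 2.0  # objective did not improve: reduce the distance
    return point, value


def bowl(p):
    # Simple test objective with its minimum at (1, -2)
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


best, best_value = gradient_descent(bowl, [5.0, 5.0], 1e-6)
```

A smaller tolerance gives a more precise answer at the cost of more iterations, which matches the role of the argument passed to GradientDescent in the fitting script.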
Table 6.10. Fitting Functions Available

| Function | Arguments | Function | Parameter Outputs |
|---|---|---|---|
| Cubic | minA, maxA, minB, maxB, minC, maxC, minD, maxD | y=Ax^3+Bx^2+Cx+D | A, B, C, D |
| Gaussian | MinPosition, MaxPosition, FWHM, Area | | Position, FWHM, Area |
| Lorentzian | MinPosition, MaxPosition, maxFWHM, maxArea | | Position, FWHM, Area |
| Offset | minA, maxA | y=A | A |
| PseudoVoigt | minPos, maxPos, maxFWHM, maxArea | | Position, FWHM, Area |
| Quadratic | minA, maxA, minB, maxB, minC, maxC | y=Ax^2+Bx+C | A, B, C |
| StraightLine | minA, maxA, minB, maxB | y=Ax+B | A, B |
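The Gaussian entry reports a position, a FWHM and an area rather than a height and a standard deviation. The sketch below shows one common way of evaluating a peak from those three numbers, and of summing several peaks into a composite curve like the one fitted in the script. It is plain Python and assumes the standard area-normalised Gaussian; whether the GDA uses exactly this form is not stated in this document. The function names and the example peak values are illustrative only.

```python
import math


def gaussian(x, position, fwhm, area):
    """Area-normalised Gaussian parametrised by position, FWHM and area."""
    # Convert full width at half maximum to a standard deviation
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    height = area / (sigma * math.sqrt(2.0 * math.pi))
    return height * math.exp(-((x - position) ** 2) / (2.0 * sigma ** 2))


def composite(x, peaks):
    """Sum of several Gaussians, each given as (position, fwhm, area)."""
    return sum(gaussian(x, *peak) for peak in peaks)


# Two illustrative peaks (made-up values, not taken from a real fit)
peaks = [(2.0, 1.0, 1.0), (5.0, 2.0, 3.0)]
curve = [composite(0.1 * i, peaks) for i in range(101)]
```

With this parametrisation the curve falls to half of its peak height at a distance of FWHM/2 from the position, and the integral under each peak equals its area.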