Data analysis

We'll show you how it's done!

How do I load data?

Data is the foundation of any data analysis, so the question is absolutely understandable. To load data, use the load command together with the file name. By default the data is loaded into the table data(). With the totable option you can instead create a table named after the file or, for *.ndat files, after the embedded table.

The data must be in tabular form and should be purely numeric. Strings that cannot be interpreted as numeric values are added to the column header. File formats that are supported include *.ndat, *.txt, *.csv, *.xls and *.xlsx.
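To make the header rule above more tangible, here is a minimal sketch in Python (not NumeRe syntax, and not NumeRe's actual loader). It assumes whitespace-separated columns and simply moves every token that cannot be read as a number into the header of its column. The function name parse_table and the sample values are made up for illustration.

```python
def parse_table(text):
    """Split whitespace-separated text into column headers and numeric columns."""
    headers, columns = [], []
    for line in text.strip().splitlines():
        for i, token in enumerate(line.split()):
            # Grow the per-column storage on demand
            while len(headers) <= i:
                headers.append([])
                columns.append([])
            try:
                columns[i].append(float(token))
            except ValueError:
                # Not a number: treat the token as part of this column's header
                headers[i].append(token)
    return [" ".join(h) for h in headers], columns


# Made-up sample data, only loosely modelled on the example file
sample = """omega b=3 b=5
0.1 1.0 2.0
0.2 1.5 2.5"""

headers, columns = parse_table(sample)
print(headers)  # ['omega', 'b=3', 'b=5']
print(columns)  # [[0.1, 0.2], [1.0, 1.5], [2.0, 2.5]]
```

How NumeRe treats a string in the middle of a numeric column is not covered by this sketch, so a purely numeric table with a header line remains the safest input.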

How do I display data graphically?

A good first step in analyzing data is to plot it to get an overview. It also helps to display the column headers, which give you an idea of the format of the data. For illustration we use the dataset <scriptpath>/examples/amplitude.dat, which you can find in your NumeRe distribution.

So let's load the data and display the column headings. To do this, use the # character as the row index of the dataset: data(#, :). The terminal should then show

ans = { "omega", "b=3", "b=5", "b=0"}

So we know that there is a common column omega and three measurements: b=3, b=5 and b=0. With this information we can write the correct plot command.

To create an ordinary line or point plot, you use the plot command followed by the expressions or data sets to be plotted. In our case this is

plot data(:, 1:2), data(:, 1:3), data(:, 1:4)

A new window appears showing the three data sets as disconnected points. Looking at the plot, we can guess that these data are resonance measurements.

Two other ways to display data graphically are the boxplot and the histogram. The boxplot (enabled with the boxplot option) directly shows the spread of the data and where most of the data points are located: the crossbars split the data into four parts containing 25% of the values each, so the middle box contains 50% of all values. A histogram (generated with the hist command), in contrast, shows the distribution of the data points as bars, each counting the values that fall into one interval. This can be used, for example, to recognize skewness in the data.
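The numbers behind both plot types can be sketched in a few lines of Python (not NumeRe syntax). The sample values and the helper histogram are made up for illustration, and NumeRe may compute quartiles and bin edges differently in detail.

```python
import statistics

# Made-up sample with a long right tail
values = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]

# The three quartile cut points: the middle one is the median,
# the outer two bound the box that holds about half of the values
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")


def histogram(data, bins):
    """Count how many values fall into each of `bins` equal-width intervals."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in data:
        # The maximum value is assigned to the last interval
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


print(q1, median, q3)        # 2.25 3.0 4.0
print(histogram(values, 4))  # [3, 5, 1, 1]
```

The bar counts drop off slowly towards larger values, which is the kind of skewness a histogram makes visible at a glance.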

What do I need to do to fit a function to my data?

Ah yes, the topic of function fitting. There is a lot you can get wrong here, but also a lot you can get right. We don't want to go into that here, though; there are enough sources on the web that describe the correct approach to fitting and give hints on what to do and what not to do. Instead, we will focus on how to use the NumeRe fit algorithm.

First of all, you need data. We'll use the previous example again, where we already established that we're dealing with resonance measurements. As you may be aware, resonance curves are described by the Lorentz function:

lorentz(x, x0, amplitude, damping, offset) := amplitude / sqrt((x^2 - x0^2)^2 + 4*damping^2*x^2) + offset
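To get a feeling for the roles of the parameters, the same formula can be evaluated in a short Python sketch (not NumeRe syntax). The parameter values below are made up and are not fit results for the example dataset.

```python
import math


def lorentz(x, x0, amplitude, damping, offset):
    """Same expression as the NumeRe definition above."""
    return amplitude / math.sqrt((x**2 - x0**2)**2 + 4 * damping**2 * x**2) + offset


# Made-up parameters for illustration only
params = dict(x0=2.0, amplitude=1.0, damping=0.5, offset=0.0)

at_zero = lorentz(0.0, **params)       # amplitude / x0^2 + offset
at_resonance = lorentz(2.0, **params)  # amplitude / (2 * damping * x0) + offset

print(at_zero)       # 0.25
print(at_resonance)  # 0.5
```

At x = x0 the first term under the root vanishes, so with small damping the curve rises well above its low-frequency value. The offset simply shifts the whole curve up or down.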

If we define this function via define, we can use it comfortably in the fit algorithm. The fit algorithm is started with the command fit, which takes the selected data and, via the option with, the function to be fitted. You can also specify the parameters to fit, but if you don't, NumeRe will automatically use all variables except x and y as parameters:

fit DATA() -with=FUNCTION(x, ...)

fit DATA() -with=FUNCTION(x, ...) params=[...]

After a successful fit, NumeRe automatically defines the fitted function as Fit(x), so plotting it is no problem at all.

More to come ...