ThreeB 1.1
///@file documentation.h Doxygen documentation bits

/// @defgroup gDebug Useful debugging functions.

/// @defgroup gUtility General utility functions.

/// @defgroup gHMM Generic hidden Markov model solver.
///
/// The notation follows `A tutorial on hidden Markov models and selected applications in speech recognition', Rabiner, 1989
///

/// @defgroup gPlugin Classes related to the ImageJ Plugin

/// @defgroup gStorm Storm classes

// @defgroup gMultiSpotDrift Storm classes specific to multispot processing with drift

/// @defgroup gStormImages Storm imagery classes (basic image processing)

// @defgroup gGraphics Graphical display

// @defgroup gUserInterface General user interface classes

/// @file multispot5.cc Fit spots to the data

/// @file multispot5_headless.cc FitSpots driver for entirely headless (batch) operation

/// @file multispot5_gui.cc FitSpots driver for interactive (GUI) operation and debugging

///@file mt19937.h Mersenne twister interface code

///@file randomc.h

///@file mersenne.cpp Agner Fog's Mersenne Twister implementation.

/// @file debug.h Debugging bits

/// @file debug.cc Debugging bits

/// @file utility.h Utility bits.

/// @file utility.cc Utility bits.

/// @file storm_imagery.cc Code dealing with storm imagery (low level).

/// @file storm_imagery.h Code dealing with storm imagery (low level).

/// @file storm.h Code dealing with storm imagery (high level).

/// @file forward_algorithm.h Contains an implementation of the forward algorithm.


/// @defgroup gMultiSpot Storm classes specific to multispot processing
//
// Having completely smooth drift is too memory intensive.
//
// The drift model is linear, approximated as piecewise constant. Generally there
// are <i>F</i> frames and these are divided up into <i>S</i> steps. The equation
// for finding the step \e s from the frame \e f is:
// \f[
//    s = \lfloor \frac{f S}{F} \rfloor.
// \f]
// The inverse is defined to be:
// \f[
//    f \approx \frac{s + \frac{1}{2}}{S}F
// \f]
//


/** \mainpage 3B Microscopy Analysis

\section sIntro Introduction

This project contains the reference implementation of the 3B microscopy analysis method,
and an ImageJ plugin.
Please refer to <a href="http://www.coxphysics.com/3b">the project website</a> for more information
on the method.

To get started with analysing data, the <a href="http://rsbweb.nih.gov/ij/">ImageJ</a> plugin is
the most suitable piece of software. This can be obtained from <a href="http://www.coxphysics.com/3b">the project website</a>.

For more advanced analysis (such as running on a cluster),
the program \c multispot5_headless should be used.
This program needs to be run from the commandline.

This project contains the source code for the commandline program and the ImageJ plugin.

\section sStartHere Getting started

\subsection sdata Dataset types and experimental parameters

Bayesian analysis of blinking and bleaching allows data to be extracted from
datasets in which multiple fluorophores are overlapping in each frame. It can
of course also be used to analyse standard localisation (PALM/STORM) data, but
it must be borne in mind that the low density and high frame number of such
datasets can lead to long runtimes. Here we briefly discuss the different types
of dataset, related to different applications, that you may wish to analyse
using 3B.

\subsubsection slow Low density PALM/STORM datasets

These datasets have few fluorophores overlapping
in each image and are at least 10,000 frames long. As discussed above, they can
be analysed with 3B but their run time makes this a large time investment. If
you wish to use this approach for performance verification, we would suggest
selecting a small spatial area (around 1.5-3\f$\mu\f$m square). The algorithm can
also be parallelised by running different sets of a few hundred frames
for the same area separately. Even with parallelisation, the large number of
frames will make it time consuming to run. If you simply want to know what the
structure is like, we would suggest using a method such as QuickPALM, which is
very fast in comparison.

\subsubsection shigh High density fixed cell datasets

These datasets are of fixed cells but have
multiple fluorophores overlapping in each frame. You may acquire this type of
dataset if the system you use has a fluorophore which cannot be photoswitched,
or the blinking properties of which cannot be tuned over a wide enough range
using the embedding medium, or if your light source is not powerful enough to
drive most of the fluorophores into the non-emitting state. In fixed samples
labelled with fluorescent proteins there will in our experience be almost no
blinking present, and 3B will therefore pick up bleaching. While this can
produce satisfactory results, the localisation is based on a single event on a high
background and therefore the performance will be significantly degraded.

Some users choose to acquire this type of dense data if they have severe
problems with drift, particularly in the z-direction, as it drastically cuts the
drift over the acquisition time.
However, in the long term it is worth trying
to improve the stability of such systems, since z-drift will impact the accuracy
of all types of localisation measurement. If you are unsure how badly your
system is affected by z-drift, it is useful to carry out a calibration of the
drift of the microscope using a bead sample before starting acquisition of data
for superresolution.

\subsubsection slive Live cell datasets

These datasets are of live cells, generally labelled with
standard fluorescent proteins such as mCherry. The mounting medium should be
phenol red free, to avoid unnecessary background. The intensity of the light
source should be selected such that it is high enough to produce blinking but
not strong enough to completely bleach the sample over the time of the
acquisition. For example, with standard Xe arc lamp illumination, the full
power of the lamp is generally suitable, but if you are using a powerful laser
source it is recommended to take multiple datasets with different power levels to
determine which is suitable.

The time which is needed to acquire the data necessary for a single
superresolution frame is dependent on the illumination intensity, the speed of
the camera, and the properties of the fluorophore. Many older EMCCD cameras have
a maximum acquisition speed of around 50 frames per second for a small region of
interest, which then gives a limit of 4 s to acquire 200 frames. High
specification EMCCD cameras and sCMOS cameras have much higher acquisition
speeds of up to thousands of frames per second for restricted areas.
However,
in order to maintain the number of photons from each fluorophore per frame the
illumination intensity would have to be increased, which is likely to bleach the
sample rapidly, and change the blinking properties of the fluorophore to some
extent.

The acquisition of live cell datasets allows dynamics to be observed. It should
be noted, however, that if the structures move over the timescale of the
acquisition then that movement will cause blurring in the reconstructed image.


\subsubsection slive Selection of appropriate cell structures for observation

The 3B algorithm must
be able to pick up the changes in intensity which occur when a fluorophore
switches between an emitting and a non-emitting state. The accuracy with which
localisation can occur depends, as for other localisation techniques, on the
number of photons from the fluorophore and the background level. Since 3B is
generally used with a widefield setup, if the sample has a lot of fluorescent
structure out of the plane of focus, the background will be higher and it will
be more difficult to localise. For thick samples, the background can be reduced
by using TIRF or high angle illumination.



\subsection sanalysis Iterations and run time


In general, determining the number of iterations required for MCMC algorithms
is an unsolved problem. A good general rule is that once the reconstruction
stops changing significantly with increasing iterations, then it is likely that
the reconstruction has converged to a reasonable point.

If you are unsure, then rerun exactly the same area with exactly the same
parameters, but with a different random seed.
Note that the ImageJ plugin will
automatically select a different seed each time in the standard interface, but
with the advanced interface or commandline program, the seed must be specified
in the configuration file (see \ref sconfig). If the two results appear
essentially the same, then it is very likely that a sufficient number of
iterations has been reached.

A good general rule is that 200 iterations is sufficient for convergence under
almost all circumstances. For the example usage given in this manual, the
required number of iterations for good convergence takes about 6 hours on a
standard PC (Core i7 at 3GHz).

Convergence may be achieved before 200 iterations, but
terminating the run before convergence can lead to artefacts. As with any
microscopy method, the experiment and analysis should always be carried out
appropriately to minimise the risk of artefacts.

There are a number of issues which can lead to artefacts:
- Early termination of the algorithm
  - The algorithm builds up a reconstruction using a number of random samples.
    If the algorithm is terminated too early, then the reconstruction will be
    dominated by the randomness, and there will not be enough samples for random
    fluctuations to average out.
  - MCMC algorithms may exhibit a property called
    <a href="http://en.wikipedia.org/wiki/Metropolis%E2%80%93Hastings_algorithm">burn
    in</a>. As the 3B algorithm runs, it maintains an estimate of the number of
    spots present in the image. Since it is an MCMC algorithm, this estimate
    will fluctuate around some mean value. However, 3B can add or
    remove spots at a rate of at most 5 per iteration (usually much slower). If
    the algorithm is started with a very bad estimate of the number of spots
    then it may take many iterations for it to approach the mean.
    During this
    time, the samples drawn will not be representative.
    \n
    3B must therefore run for enough iterations that the later, representative
    iterations will dominate and swamp the earlier ones. If 3B is started using
    a very bad estimate, then this can require substantially more than 200
    iterations.

- Bright regions close to the boundary
  - The 3B algorithm will not examine any pixels outside the boundary of the
    marked region. If there is a bright region of the image on or near the
    boundary, then 3B will naturally attempt to place fluorophores there. However,
    since the point spread function of the microscope is typically several pixels across, the images of fluorophores near
    the boundary will extend across it and so will be missing information, which
    could lead to artefacts.
    \n
    If possible, placing the boundary near a bright region should be avoided. If
    this is not possible, then the reconstruction close to the boundary should
    be ignored. Anything further than the PSF diameter from the boundary is
    unlikely to be affected. Typically this is around 3 pixels.

- Insufficient background regions
  - The 3B algorithm needs to be able to estimate the background noise level in
    order to model the image correctly. If there is an insufficient amount of
    background (for instance if the images are too small), then the estimation
    of the noise level may be poor, which could lead to artefacts.
    \n
    The sample data provided is an example of data for which the background can
    be estimated effectively.

- Image drift or motion
  - The 3B algorithm does not model image motion.
    \n
    If there is significant motion, for example due to drift or a live cell
    moving, then artefacts may result.
    The artefacts take the form of streaking
    or smearing in the direction of drift, or structure bunching randomly at one
    end of the drift or the other.
    \n
    Ideally the experiment should be run to minimise drift, but if this is not
    possible, then drift correction software (for example based on tracking
    beads) should be used to correct the drift. For live cell analysis, a
    tradeoff can be made between spatial and temporal resolution. If fewer
    frames are analysed, the temporal resolution is higher but the spatial
    resolution will be degraded. Varying the number of frames analysed can also
    be used to investigate the impact of sample movement in live cells, if this
    is a concern.

- Incorrect parameters
  - If the parameters (especially the FWHM of the point spread function of the microscope) are set incorrectly then
    3B will not be able to model the image correctly, which may lead to poorer
    resolution or artefacts. The FWHM can be readily determined by taking a
    diffraction limited image of beads.

- Very high background levels
  - See \ref slive .


\section sUsingPlugin Using the ImageJ Plugin


A tutorial for using the ImageJ plugin is provided under
<code>Plugins>3B>Help</code>

The plugin can operate in two modes, standard and advanced. The standard mode
allows the user to set the microscope PSF and spot size, the starting number of
spots for the analysis and the range of frames.

The plugin also offers an advanced mode of operation which allows much greater
control over 3B. See \ref sconfig for further details.

\section sCMD Using the commandline program

We have provided a set of test data on the website. Download and unpack the zip
file.
It will create a new directory called \c test_data with the following
contents:
@code
test_data/AVG_test_data.bmp
test_data/markup1.bmp
test_data/img_000000000.fits
test_data/img_000000001.fits
test_data/img_000000002.fits
...
test_data/img_000000299.fits
@endcode

Then run the following command:

@code
./multispot5_headless --save_spots test_data/results.txt --log_ratios test_data/markup1.bmp test_data/img_000000*
@endcode

The program will save the results in the file \c test_data/results.txt . The
program will run indefinitely in the default setup, but you may view the
results at any stage. There is no well defined stopping point for this type of
algorithm, so it is advisable to continuously monitor the resultant image, and
stop the algorithm when the output image is no longer changing with time. After
30 minutes on a fast PC (e.g. Core i7 975), the ring structure which is not
resolved in the widefield image should be clearly visible.
After about 75 mins, the finer details of the structure begin to approach those seen in Fig 2e
in the associated paper.

The ImageJ plugin can load a results file and perform a reconstruction.


Alternatively, you can process the results file further in order to view the results.
Run the following command:
@code
awk '/PASS/{for(i=2; i <=NF; i+=4)print $(i+2), $(i+3)}' test_data/results.txt > test_data/coordinates.txt
@endcode

The file <tt>test_data/coordinates.txt</tt> contains a long list of \f$(x, y)\f$
coordinates, representing possible spot positions. In order to view the
results, load the data into a graph plotting program and create a scatter plot.
NOTE: the axes are in pixel coordinates, so you will have to multiply any
distances by the number of nm/pixel in order to get distances in nm.
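
The extraction and unit conversion described above can also be sketched in
Python. This is only an illustrative sketch, not part of the 3B distribution:
the function name \c extract_spots_nm is hypothetical, the field layout is
inferred solely from the AWK one-liner (lines containing PASS, groups of four
fields, of which the last two are printed), and the scale in nm/pixel is an
assumed value that you must replace with the one for your microscope.

```python
def extract_spots_nm(lines, nm_per_pixel):
    """Mirror the AWK one-liner, then scale pixel coordinates to nm.

    Hypothetical helper: the field layout is assumed from the AWK command,
    not taken from a specification of the results file.
    """
    coords = []
    for line in lines:
        if "PASS" not in line:       # same filter as the /PASS/ pattern
            continue
        fields = line.split()        # whitespace split, like AWK's default
        # AWK fields are 1-based: i = 2, 6, 10, ... and it prints
        # $(i+2) and $(i+3). In a 0-based Python list these are
        # fields[i + 1] and fields[i + 2].
        for i in range(2, len(fields) + 1, 4):
            if i + 2 >= len(fields):
                break                # incomplete trailing group: skip it
            x = float(fields[i + 1]) * nm_per_pixel
            y = float(fields[i + 2]) * nm_per_pixel
            coords.append((x, y))
    return coords
```

For instance, passing the open file \c test_data/results.txt as \c lines
together with an assumed scale of 100 nm/pixel returns a list of (x, y) tuples
in nm, which can then be handed to any plotting program for the scatter plot.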


\subsection sExplaneExample Example usage explained


\subsubsection ssTestdata Test Data

The 300 FITS files in the test directory correspond to the data used for Fig. 2
in the paper. Please refer to the paper for details on how the data was
obtained.

The file <tt>AVG_test_data.bmp</tt> is a Z projection made using
<a href="http://rsbweb.nih.gov/ij/">ImageJ</a>.

The file <tt>markup1.bmp</tt> is a mask indicating which area of the image to
analyse. All perfectly black pixels are ignored, everything else is analysed. If
you overlay <tt>markup1.bmp</tt> and <tt>AVG_test_data.bmp</tt> you can see
which area the markup corresponds to. The markup file was created using
<a href="http://www.gimp.org">the GIMP</a>.

\subsubsection ssRunning Running the program

The general form for running the program is:

@code
./multispot5_headless [ --variable1 value1 [ --variable2 value2 [ ... ] ] ] image1 image2 ...
@endcode

so the example sets the variable \c save_spots to \c test_data/results.txt and
the variable \c log_ratios to \c test_data/markup1.bmp. The remaining
arguments are the list of files to be analysed.

The program reads the markup from the file named in the \c log_ratios variable
(yes, the choice of name is very strange, and corresponds to a very old phase of
development). The more sanely named variable \c save_spots is the filename in
which the output is to be saved.

The program actually has a large number of variables which must be set. Most of
them you probably don't want to change, but some of them you will want to
change. The default values for these variables are stored in \c multispot5.cfg .
The format of this file should be mostly self explanatory. Everything after a
\c // is a comment and is ignored.
See \ref sconfig for further details.

You will probably want to change:
<ul>
<li> \c blur.mu

This is the prior over spot size, which is how the pixel size and microscope
FWHM are represented. Example values are given for a FWHM of 300nm at 160 and
100 nm/pixel, and for a FWHM of 270nm at 79 nm/pixel.

If you have significantly larger or smaller pixels, the performance may be
degraded.

<li> \c placement.uniform.num_spots

This is the initial number of spots to be placed down. Eventually, the algorithm
will converge to a reasonable number of spots, even if this value is far off.
The default value (15) is appropriate given the small area and dimness of the
sample data. You will want to increase this number for larger areas of markup
and relatively brighter regions.

If this number is more than 1000, then the algorithm will run very slowly and
may take several days.

</ul>

Note that variables specified on the commandline override the values of the
same variables in the configuration file.

The program can read FITS, BMP, PPM and PGM images. Depending on how it
has been compiled, it can also read TIFF, PNG and JPEG images.
The program cannot work on multi-image TIFF files. ImageJ can be used to split a
multi-image TIFF into a collection of single image files. All the images loaded
must be the same size.

\subsubsection ssExtract Extracting and visualising the data

The output file (in this case \c test_data/results.txt ) containing the results
is in a format unsuitable for plotting directly, and the coordinates must be
extracted from it. The
reason for this is that the output file contains enough information to
seamlessly continue long runs which have been interrupted.
The provided AWK
program extracts the coordinates of the spots over all iterations and puts them
in \c coordinates.txt.

Alternatively, the data can be visualised using the plugin under the menu
<code>Plugins>3B>Open 3B run</code>

\section sconfig The configuration file and advanced settings

The 3B system has a large number of parameters which control its behaviour.
These are controlled via the configuration file for the commandline program or
via the ``Advanced'' option for the plugin. The ``Advanced'' option essentially
allows you to supply a fully custom configuration file.

The sample configuration file is given below along with explanations of all
parameters. In the program, most of these parameters are used by the
FitSpots class.

\include jar/multispot5.cfg



\page sComp Compiling the programs

In order to compile the project, you will need to download and install the
following libraries:
<ul>
<li> TooN http://www.edwardrosten.com/cvd/toon.html
<li> libcvd http://www.edwardrosten.com/cvd/
<li> gvars3 http://www.edwardrosten.com/cvd/gvars3.html
</ul>
The program is portable and is well tested under Linux and OSX. It will also
compile under Windows using cygwin or MinGW.

The program can be built using the usual method for compiling under Linux:
@code
./configure && make
@endcode

\section sPlugin Compiling the ImageJ plugin

The plugin is provided pre-compiled on the project website.

There are two ways of building the plugin, manual and automatic. If you want to
make changes to the plugin, then use manual building. If you want to
automatically build the plugin for several platforms, then use automatic
building.

\subsection sManual Manual

The basic build instructions are the same as for the commandline program.
You will also need the JDK (Java Development Kit) and ImageJ installed.

You will have to locate where your system has installed the JDK. If it is
not in /usr/lib/jvm/java-6-openjdk/include, you will have to specify the path
when configuring.

First configure the system:
@code
./configure --with-imagej=/path/to/ImageJ/ij.jar --with-jni=/path/to/jdk/include
@endcode

You will also need to make sure that the JDK programs (javac, javah, etc) are in
your path. The configure script will attempt to detect the location of the JNI headers.
If it fails, you will need to specify \c --with-jni=/path/to/jdk/include


To build the Java part:
@code
make three_B.jar
@endcode


To build the plugin (on Linux):
@code
make libthreeB_jni.so DYNAMIC_PLUGIN=1
@endcode
Note that if you do not specify \c DYNAMIC_PLUGIN, then the makefile will try to
build a plugin with some dependencies statically linked in, which will almost
certainly fail unless you have set the system up to support such an operation.

On MinGW:
@code
make threeB_jni.dll
@endcode

Now copy three_B.jar and libthreeB_jni.so into your ImageJ plugins directory.


\subsection sAutoBuild Automatic Build

The automatic build method is very slow and is designed to be able to repeatably
build plugins for 32 and 64 bit Linux and Windows. It is also designed to build
the plugin with as many static dependencies as possible so that only a single
DLL/so needs to be shipped per system.

The script operates by building a temporary install of Ubuntu 10.04 LTS, and
using that to compile all variants of the plugin.

You will need a Debian based system (or a system on which the command \c
debootstrap works) and root access.

The automatic build system makes use of CLAPACK, rather than LAPACK, as the
LAPACK part is not speed critical and it is easier to build CLAPACK without
additional external dependencies.

To build, run the following commands:
@code
#First make a tar.gz of the source code
bash make_dist.sh

#Now execute the automatic build process
bash build_plugin.sh
@endcode

The build takes a long time, and you should probably edit \c build_plugin.sh to
point the installer at an Ubuntu mirror somewhere near to where you are.

At the end of the build, the script will print out a directory name like:
@code
dist-123908
@endcode

A fresh copy of the plugin DLL and shared object will be present in the
named directory.


*/