Topic: Ocean Color Features Discussion / Inherent Optical Properties Workshop / Concerns about workshop goals
- By stephane Date 2007-05-21 04:05
I have a few comments about the workshop announcement and your message below.
                                                                                         
As you and I discussed a bit a couple of weeks ago, I'm not very                         
comfortable with some of the wording used to describe the goals of                       
the workshop. For example, "... long-term goal is achieving community                    
consensus on the most effective algorithmic approach for producing                       
global scale, remotely-sensed IOP data products". This is a pretty                       
vague statement. What do you mean by "the most effective algorithmic approach"? What defines effectiveness? Speed? Accuracy? Precision? All of the above? Is an algorithm that gives the perfect answer in 5 days better than an algorithm that gives a "not so bad" answer in a split second? Also, how do you build "consensus"? What are the metrics? The list of questions to answer is pretty long, so I
think we should try to be clear about all this from the get-go. I                        
think the metrics/rules that will be used to evaluate models will                        
need to be clearly set early in the process. I'm just trying to avoid                    
some of the headaches we had in SeaBAM, OCBAM and the IOCCG IOP                          
working group...                                                                         
                                                                                         
Also, you guys are planning on recoding each of the algorithms/models                    
so they can be run in msl12. That's fine but in doing so you have to                     
ensure that the msl12 version reproduces exactly the numbers returned                    
by the original version of the model. If, when recoding things, you guys need to change a few things here and there (e.g., a minimization method), you may very well end up with an implementation that does not accurately reproduce what the original code produces. We've seen this during the OCBAM workshop and, as another example, the GSM algorithm implemented in SeaDAS does not produce the exact same numbers as my original code (it's generally very close, but there are instances where substantial differences exist). This is also true for our own versions of the GSM model in IDL and MATLAB (close but not exactly the same). As you stated, model providers will need to do some verification of this, but what if the msl12 and original codes do not agree?
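
To make the kind of check I have in mind concrete, here is a toy sketch (my own illustration, not msl12 or SeaDAS code) that runs the original and recoded implementations over the same set of Rrs spectra and summarizes the relative differences per product; the two callables are placeholders for whatever each group provides:

# Toy sketch of a cross-implementation check -- the callables are placeholders,
# not actual msl12 or SeaDAS functions.
import numpy as np

def compare_implementations(f_original, f_recoded, rrs_samples, names):
    """f_* : callables returning one value per product name for a single Rrs
    spectrum; rrs_samples : iterable of Rrs spectra; names : product labels."""
    orig = np.array([f_original(r) for r in rrs_samples], dtype=float)
    port = np.array([f_recoded(r) for r in rrs_samples], dtype=float)
    for i, name in enumerate(names):
        a, b = orig[:, i], port[:, i]
        ok = np.isfinite(a) & np.isfinite(b) & (a != 0)
        rd = np.abs(a[ok] - b[ok]) / np.abs(a[ok])
        print(f"{name}: median rel. diff {np.median(rd):.2e}, "
              f"max {rd.max():.2e}, n > 1% = {(rd > 0.01).sum()}")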
                                                                                         
These are my 2 cents (for now). Sorry to bring the headaches on so                       
soon....                                                                                 
                                                                                         
Best,                                                                                    
                                                                                         
Stéphane              
Parent By @jeremy Date 2007-05-21 10:45
Hi Stephane (and others),                                                                

Please, feel free to complain -- the more dialog the better as far as I'm                
concerned.  Not comfortable when the group gets too quiet ;-)  Let me expand              
on our "agency" goals a bit, which will perhaps clear up our perspective.                

In our perfect world, we'd have a consolidated IOP algorithm to be put into              
operational satellite processing, accompanied by a rigorous uncertainty                  
budget for this algorithm.  Currently, a number of approaches exist, but                 
when you "tear them apart", most are very similar -- differing only by                   
basis vector parameterization (S, aph*, etc.) or inversion technique.  We                
have no interest in an algorithm "shoot-out" -- rather, we'd like to get                 
a group together (yourselves) to study these parameterizations/inversions                
to determine their sensitivities in an operational satellite processing                  
environment. We know each algorithm has been verified using in situ data (specific manuscripts and the IOCCG report) and some satellite data -- but, to our knowledge, most have yet to be rigorously vetted in the satellite environment (e.g., how and where does the algorithm perform in Level-2 versus Level-3 space? does spectral resolution matter, given that we lose a green band with VIIRS? what happens when the input Lwn are imperfect?), nor has satellite inversion failure remediation been robustly explored. In other words, we'd like to better understand why some algorithms "blow up" when globally applied, and to determine if/how such events can be avoided. All of you have worked on this in some capacity; we just want to "get the band back together".

With regard to implementation in msl12 -- well, we need the algorithms to work in this processing environment -- otherwise, no operational satellite products from NASA, right? Our goal is to ensure that msl12 faithfully replicates each original implementation; however, differences will arise (we'd like to know where and why) and we'll have to iterate with each PI. Ultimately, we hope to build a version of msl12 with a generic IOP algorithm that allows command-line parameterization of S, aph*, etc., and of the inversion method. We're not there yet, but it's coming. This will take some work and much thought as to implementation. As you mention, changing the inversion (for example) changes the answer -- which is another reason why we're asking the questions we're asking (why? do we have strong recommendations as a community? etc.)
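
As a rough illustration only -- the option names below are hypothetical, not actual msl12 parameters -- the kind of "generic" parameterization we have in mind might look something like this:

# Hypothetical sketch of a "generic IOP algorithm" configuration; the option
# names and defaults are illustrative only, not actual msl12 parameters.
from dataclasses import dataclass

@dataclass
class IOPConfig:
    adg_slope: float = 0.0145        # S, spectral slope of a_dg (1/nm), assumed value
    bbp_exponent: float = 1.0        # eta, power-law exponent for b_bp, assumed value
    aph_star: str = "bricaud_1998"   # choice of a_ph* spectral shape
    inversion: str = "levenberg-marquardt"  # or "amoeba", "linear-matrix", "lut"

# e.g., a run that swaps the inversion while holding the basis vectors fixed:
cfg = IOPConfig(inversion="amoeba")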

I know I didn't answer all of your questions/concerns.  We know we'll need               
a series of statistical metrics to evaluate the "infinite" series of                     
approaches, but this is a work in progress and suggestions are welcome.                  
Again, we don't want to foster an environment of algorithm competition -- one person's approach versus another's -- but rather to work towards a better understanding of how and where everything works (or doesn't) when applied to satellite Lwn.
                                           
Keep the dialog coming!
Parent By tsm Date 2007-05-21 11:05
Greetings all,                                                                           
                                                                                         
One thing that was lost at the OCBAM meeting a few years ago, in my opinion,             
was the fact that the semi-analytic algorithms under consideration at the                
time were evaluated by comparing the 3 basic output variables - Chl, ag0,                
and bbp0 - to the measured in situ values.   However, other output variables             
exist with these algorithms - namely, the other IOP results such as aph.                 
How do you weight the evaluation of multiple output fields produced from a               
single algorithm, and is it useful to look at the full spectral distribution             
of these fields? As a hypothetical example, is it 'better' to minimize
error in the chlorophyll product over the bbp or some other IOP product?   I             
really don't know...                                                                     
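
Just to make the question concrete, here is a toy sketch of one possible combined score; the equal weights and the log-space residuals are my assumptions, not a proposal:

# Hypothetical combined, weighted error score across multiple retrieved
# products; the weighting scheme and log-space residuals are assumptions.
import numpy as np

def combined_rmse(retrieved, measured, weights=None):
    """retrieved/measured: dicts of product name -> 1-D arrays.
    Returns a single weighted RMS of log10 residuals."""
    weights = weights or {k: 1.0 for k in retrieved}
    total, wsum = 0.0, 0.0
    for name, w in weights.items():
        r, m = np.asarray(retrieved[name]), np.asarray(measured[name])
        ok = (r > 0) & (m > 0)
        total += w * np.mean((np.log10(r[ok]) - np.log10(m[ok])) ** 2)
        wsum += w
    return np.sqrt(total / wsum)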
                                                                                         
This gets even more complicated when you start to evaluate the sensitivities             
of these fields within the satellite data at the different levels.                       
                                                                                         
I'm just thinking out loud...                                                            
                                                                                         
Tim                      
Parent By zplee Date 2007-05-21 11:25
Hi All:                                                                                  
                                                                                         
Apparently Stephane started this discussion/conversation ahead of the October schedule, which is great!
                                                                                         
To add to the IOP dilemma, another issue is that, unlike Chl, there is wavelength variation. An algorithm may perform well at one wavelength, but that does not mean it performs the same at other wavelengths. So, in my mind, the first thing is to
define the "standard" IOP products, mean IOPs at what wavelengths                        
will/should be produced. From there, to determine different 'operational'                
algorithms  for different products. If one is good for total a, then uses                
that for total a; if one is good for aph, then uses that for aph. It is not              
necessary to have one algorithm for all products. The best (and eventually               
it will be) is to have a package (mix/match of existing and new ones to form             
the 'best' package) for the overall IOP-products objective.                              
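
As a rough sketch only, such a 'mix/match' package could be as simple as a mapping from each standard product to the algorithm judged best for it (the product list and algorithm names below are placeholders, not an agreed standard):

# Hypothetical "mix/match" IOP package: each standard product mapped to the
# algorithm judged best for it. Names are placeholders, not a workshop decision.
IOP_PACKAGE = {
    "a_total_443": "qaa",
    "aph_443":     "gsm",
    "adg_443":     "gsm",
    "bbp_443":     "qaa",
}

def retrieve(product, rrs, algorithms):
    """algorithms: dict of name -> callable(rrs) returning a dict of products."""
    return algorithms[IOP_PACKAGE[product]](rrs)[product]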
                                                                                         
That is my two cents ...                                                                 
                                                                                         
Cheers!                                                                                  
zhongping                           
Parent By EmmanuelBoss Date 2007-05-21 12:06
Dear all,                                                                                
                                                                                         
Here are my 2c.
                                                                                    
Most differences between current IOP algorithms are cosmetic. The fundamental approach in all of them is very similar (see the sketch after the list below):
                                                                                         
1. Define a relationship between Rrs and IOPs (explicit, implicit, 2 vs. 1 term).

2. Choose a spectral shape for the IOPs (may use Rrs as input, as in QAA, or not).

3. Invert to obtain the best fit (linear, nonlinear, look-up table).
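
Here is a minimal toy sketch of that three-step scheme; the water values, basis-vector shapes, and rrs(u) coefficients are rough assumptions for illustration, not any published algorithm:

# Toy sketch of the generic three-step inversion above; all numerical values
# (pure-water a and bb, a_ph* shape, S, eta) are rough, assumed placeholders.
import numpy as np
from scipy.optimize import least_squares

wl  = np.array([412.0, 443.0, 490.0, 510.0, 555.0])        # nm
aw  = np.array([0.0046, 0.0071, 0.0150, 0.0325, 0.0596])   # pure-water a (approx.)
bbw = np.array([0.0033, 0.0024, 0.0016, 0.0013, 0.0009])   # pure-water bb (approx.)
aph_star = np.array([0.032, 0.036, 0.026, 0.017, 0.007])   # assumed a_ph* shape (m^2/mg)

def forward_rrs(p, S=0.0145, eta=1.0):
    """Steps 1+2: rrs from IOPs via a quadratic rrs(u) relation and fixed
    spectral shapes for a_ph, a_dg and b_bp."""
    chl, adg443, bbp443 = p
    a  = aw + chl * aph_star + adg443 * np.exp(-S * (wl - 443.0))
    bb = bbw + bbp443 * (443.0 / wl) ** eta
    u  = bb / (a + bb)
    return 0.0949 * u + 0.0794 * u ** 2      # below-surface rrs (Gordon et al. form)

def invert(rrs_obs):
    """Step 3: nonlinear best fit for (Chl, a_dg(443), b_bp(443))."""
    fit = least_squares(lambda p: forward_rrs(p) - rrs_obs,
                        x0=[0.2, 0.01, 0.001],
                        bounds=([0.0, 0.0, 0.0], [100.0, 5.0, 1.0]))
    return fit.x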
                                                                        
The subtle differences we should discuss are whether certain choices within these algorithms (e.g., the options in brackets above) are clearly superior to others across a broad category of tests, including:
                                                                                         
1. Matchup with datasets not used in tuning.

2. Computing speed.

3. Generation of uncertainties.

4. Ability to work in complex waters.

5. Dealing with inelastic scattering.

among others.
   
I don't think we should have a pissing contest among the existing algorithms, but rather evaluate the advantages of each in order to suggest a set of recommendations for yet-to-be-invented, more optimal algorithms.
                                                                                         
Don't forget that the MEASURED IOPs against which we test the algorithms have uncertainties of their own:

For a_phi from filter pads, the only method for scattering correction is a constant offset removal, possibly resulting in an overestimate at blue wavelengths. For b_bp, we use a smooth curve even though we know that in reality it is not smooth when algae are present. I will leave the a_cdm problems alone for now.
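
For concreteness, a toy sketch of the constant-offset (null-point) correction I mean; the near-infrared reference wavelength is an assumption:

# Toy sketch of a constant-offset scattering correction for filter-pad a_p:
# subtract the value at a NIR reference wavelength, assuming true absorption
# there is zero. The 750 nm reference is an assumption, not a standard.
import numpy as np

def null_point_correction(wl, ap, ref_wl=750.0):
    """Subtract the particulate absorption at ref_wl from the whole spectrum."""
    offset = np.interp(ref_wl, wl, ap)
    return ap - offset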
                                                                                         
Cheers,                                                                                  
                                                                                         
   Emmanuel                                 
Parent By paul.lyon Date 2007-05-21 12:15
Jeremy, and others:                                                                      
                                                                                         
I understand Stephane's concerns.  I appreciate what you are trying to                   
do also.  One of the issues, I think, is that there are trade-offs for each technique.  An algorithm that is designed to work well
globally can be appropriate for large scale IOP studies, however, a                      
regionally tuned algorithm may be more appropriate for in depth small                    
scale process studies.  Also, the number of bands used and parameters                    
retrieved should be driven by the goals of the study at hand (someone                    
interested in total absorption doesn't need an algorithm that retrieves                  
all the different phytoplankton pigment groups).                                         
                                                                                         
I think it will be very difficult to find a one-size-fits-all algorithm to have as a default and, on the other hand, very difficult to code up all the options that can be used...  I have 4 major versions of
just the linear matrix inversion method that I use that are coded in                     
ways that are best suited for different tasks.  Remember the issues                      
users had with the multitude of Chl algorithms available on MODIS?  It                   
will be very difficult to include all the methods for that reason alone.                 
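
For illustration only, a toy linearization in that spirit (not one of my versions; the basis shapes and water values are placeholders) solves for the amplitudes in a single least-squares step once u = bb/(a+bb) is known at each band:

# Toy sketch in the spirit of a linear matrix inversion -- not my code; the
# spectral shapes (a_ph*, a_dg slope S, b_bp exponent eta) and the water
# spectra passed in are assumed placeholders.
import numpy as np

def lmi_invert(u, wl, aw, bbw, aph_star, S=0.0145, eta=1.0):
    """Solve u*a + (u-1)*bb = 0 for (Chl-like amplitude, a_dg(443), b_bp(443))."""
    adg_shape = np.exp(-S * (wl - 443.0))
    bbp_shape = (443.0 / wl) ** eta
    A = np.column_stack([u * aph_star, u * adg_shape, (u - 1.0) * bbp_shape])
    y = -(u * aw + (u - 1.0) * bbw)
    x, *_ = np.linalg.lstsq(A, y, rcond=None)
    return x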
                                                                                         
I do think it is good to get together and have a discussion of the                       
strengths and weaknesses of the different methods.  The IOCCG report                     
number 5 did a pretty good job of describing several algorithms.  Maybe                  
as a group we can expand on the information published there.                             
                                                                                         
                                                                                         
Paul
Parent By tjsm Date 2008-05-22 08:02
... and now to continue the conversation on the forum ...

Following on from Ping's comment yesterday:

"So, in my mind, the first thing is to define the "standard" IOP products, mean IOPs at what wavelengths will/should be produced. From there, to determine different 'operational' algorithms  for different products. If one is good for total a, then uses that for total a; if one is good for aph, then uses that for aph. It is not necessary to have one algorithm for all products. The best (and eventually it will be) is to have a package (mix/match of existing and new ones to form the 'best' package) for the overall IOP-products objective."

I want to agree with this in a pragmatic sense - but intellectually I am not so sure.  A number of the semi-analytical algorithms rely on first determining the primary IOPs (i.e. total a and total bb) and then partitioning them according to empirical field or laboratory data (or sweeping generalisations and extrapolations).  Theoretically, this type of algorithm could produce excellent retrievals of aph and ady - but poor retrievals of total a and bb.  It may give us the right answer in the pragmatic sense, but it will not lead to any further understanding as to why it does.  Following on from Malik's excellent "ambiguity" paper last year, perhaps better understanding is worth more than pragmatism?  Keeping algorithms intact also allows us to identify sources of error and propagate them through the equations better (perhaps).
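
To make the "retrieve totals, then partition" step concrete, here is a toy two-band partition; the band pair and the assumed spectral ratios are placeholders, not the coefficients of any published algorithm:

# Toy two-band partition of non-water absorption into a_ph and a_dg; zeta and
# xi (the assumed 412/443 ratios of a_ph and a_dg) are illustrative values.
import numpy as np

def partition_anw(anw_412, anw_443, zeta=0.8, xi=np.exp(0.015 * (443.0 - 412.0))):
    """Split non-water absorption at 443 nm into a_dg and a_ph, given
    a_ph(412)/a_ph(443) = zeta and a_dg(412)/a_dg(443) = xi."""
    adg_443 = (anw_412 - zeta * anw_443) / (xi - zeta)
    aph_443 = anw_443 - adg_443
    return aph_443, adg_443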

Following on from Emmanuel's point about in-situ measurements having error: this is something we really must be mindful of.  Looking at V1 of the NOMAD database I am sure that the bb retrievals have been made at a single wavelength and then extrapolated to the other wavelengths.  Measurements aren't always measurements - there is always a degree of modelling in them, especially when using instruments such as the ac9 / bb6 etc.

That is my tuppence worth (we don't have cents here!)

Tim
Parent By EmmanuelBoss Date 2008-05-22 16:21
Tim,
the issue of ambiguity in Rrs inversion was also treated by:

Uniqueness in Remote Sensing of the Inherent Optical Properties of Ocean Water
Michael Sydor, Richard W. Gould, Robert A. Arnone, Vladimir I. Haltrin, and Wesley Goode
Applied Optics, 2004, Vol. 43, Issue 10, pp. 2156-2162 

Worth the read.

All the best,
    Emmanuel
Parent By stephane Date 2008-05-22 16:51
Tim is right about the bb data in NOMAD v1. I think a fit to the in situ data was used rather than the actual data. I think this is documented somewhere. Jeremy, do you confirm? Will you use the same approach for NOMAD v2?

Stéphane
Parent By @jeremy Date 2008-05-22 17:09
Confirmed!  The main NOMAD v1 dataset presents bb at 20 wavelengths.  At the time, most field instruments were only reporting up to 6 channels.  As such, the field data were used to "extrapolate" to the other wavelengths.  For those who are interested, the original (measured) bb data and a document describing the IOP data preparation and QC are available online via the main NOMAD Web page under "IOP Processing Documentation".  The document describes how the bb slopes / extrapolation / etc. were executed.  Specifically, check out:

http://seabass.gsfc.nasa.gov/data/werdell_nomad_iop_qc.pdf
http://seabass.gsfc.nasa.gov/data/nomad_bbeval_v1.3_2005262.txt

Ultimately, both the fit and measured data are available in NOMAD, but (as you mention) only the fit data are provided in the main data file.  I plan to do the same for v2, unless consensus suggests otherwise.
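
For anyone who wants the flavour of it without reading the document, a generic power-law extrapolation (an illustration only -- see the linked QC document for the actual NOMAD procedure) looks like:

# Generic power-law extrapolation of few-channel bbp to other wavelengths.
# Illustrative only; this is not the NOMAD fitting code.
import numpy as np

def fit_bbp_powerlaw(wl_meas, bbp_meas, wl_out, ref=555.0):
    """Fit bbp(wl) = bbp(ref) * (ref/wl)**eta in log space, then evaluate."""
    eta, log_bbp_ref = np.polyfit(np.log(ref / np.asarray(wl_meas)),
                                  np.log(np.asarray(bbp_meas)), 1)
    return np.exp(log_bbp_ref) * (ref / np.asarray(wl_out)) ** eta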
Parent By tsm Date 2008-05-22 17:20
Hi,                                                                                      
A thought on this... aggregating separate IOPs from different algorithms to represent a 'best of' may not be entirely consistent within the whole AOP-IOP system.  That is, IOPs mixed and combined from different algorithms could well lead to AOPs that differ from the original Rrs spectra when the IOPs are used in a forward model.  I think this should be tested out, but we've seen here at UNH that combining IOPs from different models 'can' lead to some pretty weird relationships...
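
Something like the following toy closure check is what I have in mind; the quadratic rrs(u) form is the usual Gordon et al. relation, and everything else is a placeholder:

# Toy closure check: push mix-and-matched total IOPs back through a forward
# reflectance model and compare with the (below-surface) rrs that went in.
import numpy as np

def closure_residual(rrs_obs, a_total, bb_total):
    """Relative residual between observed rrs and rrs reconstructed from the
    combined total absorption and backscattering spectra."""
    u = bb_total / (a_total + bb_total)
    rrs_fwd = 0.0949 * u + 0.0794 * u ** 2
    return (rrs_fwd - rrs_obs) / rrs_obs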
                                                                                         
American Tim    
Parent By Hirata Date 2008-05-23 09:26
Dear all,

I think that it is better for us to have bb data (actually, any data) as accurate as possible, especially when the data are used to evaluate IOP algorithms. Also, we may want to remember that most (or all?) bb data in NOMAD are taken by commercial instruments which actually "estimate" bb from scatterance at one or a few angles by using a model. In any instrument, there is almost always a certain "model" within the instrument to produce the "measurement", just like the satellite ocean colour we are discussing now. If we fit the "estimated bb" over a certain spectral range, as in NOMAD v1, we are getting further and further away from the "true" bb (the fitting process does not necessarily pick up even the original measurements at some wavelengths, or does it?). As far as original measurements are available, I think that the original measurements should be used. Emmanuel also warned that the bb spectra are not always smooth as a common model shows, probably due to effects of absorbing particles.  Although using the original data may reduce the number of data matched up with other IOPs & AOPs (in terms of wavelength), we may want to admit that just as a fact?
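
As a toy illustration of the kind of "model within the instrument" I mean (the chi factor and the pure-water subtraction are assumptions, not any manufacturer's processing):

# Toy conversion from a single-angle volume scattering measurement beta(theta)
# to particulate backscattering; chi and beta_water are assumed inputs.
import math

def bbp_from_beta(beta_meas, beta_water, chi=1.1):
    """Particulate backscattering estimated from one scattering angle."""
    return 2.0 * math.pi * chi * (beta_meas - beta_water)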

Another idea is that we may use "both" the fitted and measured data in the IOP model comparison. But in this case, we need to clearly state which results are obtained from which "data" (i.e. fitted or measured). In this way, we can evaluate the "potential" of an IOP model, which may be used to estimate bb at wavelengths not supported in the validation data (= measurements). My point here is that the use of both types of data allows us to separate the "potential" from the "actual performance" of an IOP model.

In any case, we might need original measurements?

Takafumi
Parent By @jeremy Date 2008-05-23 12:50
Good morning (on the US east coast, anyway), all.

Couple of thoughts for the conversation regarding field data.  Both the measured and fit bb (and ad and ag, for that matter) are available via the NOMAD web site -- as is a document that describes how the measured data were fit.  I'd suggest that a few of you take a look at these data (as time permits) to evaluate their utility.  Believe me, you don't have to convince us that some data have quality issues.  But, they were rigorously screened and, hopefully, most errors are systematic.  Ultimately, we'd *love* to have more eyes on the product, so this exercise would be really useful from our perspective.  I'd be interested in hearing more about your thoughts on fit versus measured data for this activity.

FYI, after fitting, the modeled and measured data were visually inspected in tandem to confirm coherence.  In general, almost everything in NOMAD was visually inspected -- one reason why it takes so long to generate the data set.  Doesn't mean the data are perfect, just means we've aimed for consistency and searched for obvious errors.   

I'm glad we're having this discussion, as the quality of the field data -- and the general lack of assigned uncertainties for these products -- is very important (for the workshop and otherwise).  Keep in mind, however, that statistical comparisons with field data may only emerge as a minor metric for our workshop analyses.  First, this has been done thoroughly in the past.  Second, personally, I'm more interested in using these data to better understand algorithm sensitivities -- that is, how do the comparisons change (relative to themselves) as input parameters (Rrs) or constants (aph*, S, etc.) are modified.   Finally, don't forget that we'll be using satellite data (match-ups, level-2 and level-3 files) as well.  All I'm saying is that "all of our eggs aren't in the in situ data basket" --- which, hopefully, provides everyone with some comfort.  Now, I know the satellite Lwn aren't perfect either -- but, to make any progress towards our specific goals, we're going to have to take a leap of faith that they're good enough for the Workshop (as a starting point, anyway).  I don't envision having sufficient time to delve deeply into atmos. correction and normalization problems -- all good topics and fodder for a follow up activity.
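
As a rough sketch of the kind of sensitivity sweep I mean (the "invert" callable and the choice of perturbing S are placeholders, not a workshop protocol):

# Hypothetical sensitivity sweep: re-run an inversion while perturbing one
# constant (here the a_dg slope S) and summarize how the retrievals move
# relative to the baseline. "invert" stands for any algorithm exposed as a
# callable; it is a placeholder here.
import numpy as np

def sensitivity_to_S(invert, rrs_samples, S_values, S_baseline=0.0145):
    """Median relative change of each retrieved parameter versus the baseline."""
    base = np.array([invert(r, S=S_baseline) for r in rrs_samples])
    out = {}
    for S in S_values:
        pert = np.array([invert(r, S=S) for r in rrs_samples])
        out[S] = np.median(np.abs(pert - base) / np.abs(base), axis=0)
    return out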

Happy Friday.

  J 