
Document Properties

identifier: SOL-SGS-TN-0006
short: SEGU
latest release: 1.2
latest release date: 2017-09-19
status: RELEASED
security classification: ESA UNCLASSIFIED – For Official Use
owner:

Document Comments


Document Releases

  File Modified
PDF File SOL-SGS-TN-0006-SEGU-1.0.pdf Issued version after SOWG #6 Mar 05, 2015 by Luis Sanchez
PDF File SOL-SGS-TN-0006-SEGU-1.2.pdf Issue 1, Revision 2, Date 2017-09-19 Aug 20, 2019 by dgalan
PDF File SOL-SGS-TN-0006-SEGU-0.7.pdf For circulation before SOWG#6 Aug 21, 2019 by dgalan
PDF File SOL-SGS-TN-0006-SEGU-0.5.pdf Aug 21, 2019 by dgalan

Document Source

  File Modified
Microsoft Word Document SOL-SGS-TN-0006-SEGU.docx ** Earlier Versions Without Diagram File Sep 09, 2014 by dgalan
ZIP Archive SOL-SGS-TN-0006-SEGU.zip Remove a reference to figure 1 which was generating changes on saving pdf. Aug 03, 2017 by Richard Carr




8 Comments

  1. Review comments for this document (version 0.3).

    No comments. It can be released for the Design Review.

     

  2. Added version 0.4 (Confluence version 2 of zip) with addition of specification of naming of OVF file to be delivered.

  3. PROPOSED/PENDING CHANGES FOR 1.2

    DONE IN 1.2 Draft 1

    The following two changes were agreed at 2015-12-17 Low Latency Pipeline Engineering Telecon#9

    • Auxiliary data directory inside preceding directory. Is this needed? Proposal: Auxiliary data is always the current.
    • Proposal (further): delete the auxiliary data folder in cases where it is empty. That is, for all instruments except SoloHI and MAG.

    DONE IN 1.2 Draft 1

    Proposed by Emil Kraaikamp by email 2016-02-05 (in response to SOWG splinter draft minutes).

    Comment from Ruben Ibanez by email 2016-02-08 that he sees no objection to this option in mounting the input volumes. He considers it unnecessary for the output volumes.

    Regarding this, I talked to Stein and he suggested making some changes to the NFS mounting options. Specifically, adding the option lookupcache=none means that the virtual machine can see changes made to the input directory much faster. Together with a polling interval of 5 seconds, this does indeed seem to work much faster now. Processing is started on average within 2.5 seconds, as you would expect. I think it might be good to include this NFS mounting option in the SEGU.
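
    The polling behaviour described above can be sketched as follows. This is only an illustration, not the pipeline's actual code: the function name, the `requests_dir` argument and the `max_polls` limit are hypothetical. With a 5 second interval, a new request waits 2.5 seconds on average before it is noticed, provided the NFS client is not serving stale directory lookups (which is what lookupcache=none is meant to avoid).

```python
import os
import time


def poll_for_new_requests(requests_dir, seen, interval_s=5.0,
                          max_polls=None, sleep=time.sleep):
    """Yield names that appear in requests_dir and are not yet in `seen`.

    Hypothetical helper: `max_polls` exists only so that the loop can be
    stopped in tests; a real service would presumably poll indefinitely.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        # List the directory on every poll; new entries are reported once.
        for name in sorted(os.listdir(requests_dir)):
            if name not in seen:
                seen.add(name)
                yield name
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)
```

    The polling itself cannot compensate for client-side caching, so the mount option and the interval need to be chosen together.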

    DONE IN 1.2 Draft 1 (also includes 2 further keywords whose comments are to be excluded)

    Proposed by Emil Kraaikamp by email 2016-02-05 (in response to SOWG splinter draft minutes).

    Comment from Anik De Groof by email to Richard Carr 2016-02-09 saying that this is a perfect proposal.

    I have some comments on the FITS keywords (and filename) for comparing the output
    with the expected FITS output. The version part of the filename should be
    ignored (this is the processing time). These keywords need to be ignored: DATE
    and VERSION (already mentioned in the SEGU), FILENAME (see above) and CHECKSUM (the checksum
    covers the entire header, so it also depends on, e.g., DATE). The comment of the DATASUM keyword
    also needs to be ignored (as it typically contains the timestamp of when the keyword
    was created). This results in the following fitsdiff parameters:
     --ignore-keywords=DATE,VERSION,FILENAME,CHECKSUM --ignore-comments=DATASUM
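
    As a minimal sketch, the parameters above could be assembled into a fitsdiff command line like this. The function name and its arguments are hypothetical; only the two options and their keyword lists come from the email.

```python
def fitsdiff_command(expected, actual,
                     ignore_keywords=("DATE", "VERSION", "FILENAME", "CHECKSUM"),
                     ignore_comments=("DATASUM",)):
    """Return an argument list for the fitsdiff command-line tool.

    Hypothetical helper: the keyword lists default to those quoted in
    the email and can be extended by the caller.
    """
    return [
        "fitsdiff",
        "--ignore-keywords=" + ",".join(ignore_keywords),
        "--ignore-comments=" + ",".join(ignore_comments),
        expected,
        actual,
    ]
```

    The returned list can be passed to `subprocess.run`. Keeping the keyword lists as arguments makes it straightforward to add further keywords later without changing the call sites.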

    DONE IN 1.2 Draft 1

    Change agreed at 2016-10-05 Low Latency Pipeline Engineering Telecon #13

    Add an (optional) failed directory, alongside requests and products, to the Tests tarball delivered as part of a new VM. For the code change see SOCA-1540.

    DONE IN 1.2 Draft 1 (retained lists of keywords excluded in SEGU for now)

    Under SOCA-1672, added HISTORY to the list of keywords to exclude.

    DONE IN 1.2 Draft 1:  

    Changes agreed at 2016-11-30 Low Latency Pipeline Engineering Telecon #14

    Remove the preceding/auxiliary directory from the structure of a processing request.

    (TBC - actions on TS, LIE) Provide 2 preceding passes rather than 1 as context to METIS and SPICE, since they produce lightcurves to be chunked at 1 pseudo-day granularity, as for the in situ instruments.

  4. POST 1.2 Outstanding Issues

    Following an email from Stein on 2016-05-02, discussed with Anik De Groof where the keywords-to-ignore should be listed: possibly in the ICD rather than the SEGU. A similar argument applies to CDF. This was discussed further in the run-up to the 2017-07-10 SOWG #10 Low Latency Pipeline Splinter, and was AGREED in that meeting.

    Jose Marcos noticed the lack of an acronyms list (or a pointer to one) in 1.2.

    Following the 2017-10-04 Low Latency Pipeline Engineering Telecon #17, include a description of the use of the 'warnings.txt' file.

    Noticed on 2017-11-22 that the reference to /data/input/common in the bullet in 6.3 on interpreting Figure 2 can be removed: no such directory now appears in the diagram. Also, the preceding bullet could now refer to MAG (and maybe SoloHI) rather than SoloHI and maybe others.

  5. COMMENTS ON 1.2 draft 1

    DONE IN 1.2 proposed ISSUE created 2017-05-12.

    By email from Emil Kraaikamp 2017-01-26

    In 6.2.2 you mention that the input directory should be mounted read-only (which we do). However, when you provide the mount options for the input volume, you say to use the same options as for the output volume with only the additional option lookupcache=none, without also mentioning the read-only option 'ro'. It should therefore become: "use the same, but add the following additional options: ro, lookupcache=none".
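
    The rule suggested in this comment can be sketched as a small helper that derives the input-volume options from the output-volume options. The function name is hypothetical, the output-volume options themselves are not listed here, and dropping an explicit `rw` is an assumption made so that the result does not contain both `rw` and `ro`.

```python
def input_mount_options(output_options):
    """Derive input-volume NFS options from the output-volume options.

    Hypothetical helper. Takes a comma-separated option string, removes
    an explicit "rw" (assumption: it would conflict with "ro"), and
    appends "ro" and "lookupcache=none" if they are not already present.
    """
    options = [opt for opt in output_options.split(",") if opt and opt != "rw"]
    for extra in ("ro", "lookupcache=none"):
        if extra not in options:
            options.append(extra)
    return ",".join(options)
```

    For example, with the purely illustrative output options "rw,hard", the helper returns "hard,ro,lookupcache=none".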

  6. Post 1.2 Proposed change to location for 'failed' indicator and output.

    As described in an email from Richard Carr to Sol_LL_Pipelines_Eng@sciops.esa.int on 2018-06-08.

    The change was widely accepted. The intention is to discuss and finalise it in the 2018-07-09 SOWG#12 Low Latency Pipeline Splinter.

    The text of the email follows.

     

    Dear All,

    A while ago I modified the SEGU to include a specific location where VMs should output the results of failed processing.

    This system includes a 'failed' directory, parallel to 'products' (a sibling) inside the top-level instrument output directory.

    The SEGU says that output for a specific request will appear EITHER in 'products' OR in 'failed'. My understanding was clearly that a request either completed correctly or it did not - there was no intermediate state.

    In a recent telecon it became clear that at least some instruments intended to output both usable products and failed products. That is, I understand, that failure can be a per-product-file rather than per-processing-request result.

    It is not convenient that output for a single request is created in both products and failed. One reason for this is that the atomic (= instantaneous) delivery of the results to the final destination is no longer atomic.

    So I propose:

    1. No longer to have a 'failed' directory at the top level.

    2. Instead, all processing results are (finally) placed in 'products'.

    3. All products which are successfully created are placed directly inside the specific results directory (e.g. request_XYZ), just as now.

    4. In the case (and ONLY in that case) that there is a failure in the creation of some product (or all products), a directory 'failed' is created by the virtual machine INSIDE the specific results directory.

    5. The VM may (or may not) place any failed partial product files inside request_XYZ/failed. In the case that no meaningful product files at all are created then the output would be just a product directory e.g. request_XYZ containing nothing but

    I attach a modified version of figure 1 from section 6.2.12 to illustrate the modification to the tree structure which I propose (note, no changes at all, of course, in the input tree).

    I would like to ask that you respond ASAP, particularly of course if you see a problem or have an objection.

    I am trying now to make our systems work with tests which are expected to fail (I note that none of the instruments has provided such tests yet), and this work will depend on this data architecture.

    Regards,

    Richard
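
    The layout proposed in the email above can be sketched as follows. This is an illustration under stated assumptions, not the VM implementation: the function name and its arguments are hypothetical, and `request_XYZ` is the example name used in the email. Successful products go directly into the results directory, and a 'failed' subdirectory is created only when there is a failure.

```python
import os
import shutil


def place_results(products_dir, request_name, good_files,
                  failed_files=(), any_failure=False):
    """Place the results for one request under products/<request_name>.

    Hypothetical helper. `good_files` are copied directly into the
    results directory. A 'failed' subdirectory is created only if
    `any_failure` is set or `failed_files` is non-empty; partial
    products, if any, are copied into it.
    """
    result_dir = os.path.join(products_dir, request_name)
    os.makedirs(result_dir, exist_ok=True)
    for path in good_files:
        shutil.copy2(path, result_dir)
    if any_failure or failed_files:
        failed_dir = os.path.join(result_dir, "failed")
        os.makedirs(failed_dir, exist_ok=True)
        for path in failed_files:
            shutil.copy2(path, failed_dir)
    return result_dir
```

    Because everything for a request lives under a single directory, whatever mechanism delivers that directory to its final destination only has one tree to move.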

  7. Further Post-1.2 proposed change:

    The group ID should be the same as the user ID used in creating files. That is, it should be 1050. See Users and Groups on Low Latency Pipeline Hosting machines and LLPH-103.


    Additionally, regarding the permissions set for output directories and files by LLVMs: as described in the 2019-04-10 Low Latency Engineering Telecon #22, RC proposed drwxrwxr-x for directories and -rw-rw-r-- for files. This was accepted (pending approval of the draft minutes).
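
    A minimal sketch of how these permissions might be applied to an output tree is given below. The function name and the optional `gid` argument are hypothetical; drwxrwxr-x corresponds to octal mode 775 and -rw-rw-r-- to 664. Changing the group normally requires suitable privileges, so that step is optional here.

```python
import os

DIR_MODE = 0o775   # drwxrwxr-x
FILE_MODE = 0o664  # -rw-rw-r--


def apply_output_permissions(root, gid=None):
    """Set the proposed modes on `root` and everything below it.

    Hypothetical helper. If `gid` is given (e.g. 1050), the group of
    each directory and file is changed as well; the owner is left as is.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        os.chmod(dirpath, DIR_MODE)
        if gid is not None:
            os.chown(dirpath, -1, gid)  # -1 keeps the current owner
        for name in filenames:
            path = os.path.join(dirpath, name)
            os.chmod(path, FILE_MODE)
            if gid is not None:
                os.chown(path, -1, gid)
```

    Setting the modes explicitly, rather than relying on the process umask, keeps the result independent of how the VM's processing user is configured.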


  8. Further Post-1.2 proposed change:

    From a draft for the minutes of the LL Telecon 2020-01-22:

    For context, RC first mentioned that it recently became clear that some instrument LLVMs did not react to processing requests because they were filtering for requests which met a naming scheme different (in the length of the timestamp string) from that specified in SEGU 1.2, section 6.2.4. The test requests provided with their VM deliveries also followed this variant naming and so were processed correctly, but the SOC-produced requests were not processed.

    RC then indicated that he had discovered that the example request names in the data tree (figure 1, given in section 6.2.12 of SEGU 1.2) actually followed this incorrect pattern. It seems likely that this is the origin of the problem.

    This has been taken note of for any next SEGU version.