Reply on RC1


Treatment numbers (silt-loam soil): In the Results and Discussion sections (including Table 7) treatment numbers are provided with subscripted information including water-filled pore space, bulk density and N addition. These subscripts should be left out, since they worsen the readability of the text. All important information about the treatments can be found in Table 2.
A3: We will remove the subscripts from the text and the tables as suggested by the reviewer. The text will be modified to improve readability while still making the various influences on fluxes clear. We will also include citations to Table 2 to remind readers where to find additional information.
Inconsistencies: (i) Test criteria: According to l. 171/172, "comparing the magnitude of measured and modeled fluxes was not a criterion" for model evaluation. However, in l. 235/236 it is written that this comparison "was considered a secondary criterion." This should be clarified.
A4: Yes, the magnitude of measured and modelled fluxes was, in fact, considered, although it was not our primary focus. We will clarify the text accordingly.
(ii) Tables 1 and 3: For the silt-loam soil, the sum in Table 1 corresponds to the respective sum in Table 3 (14 mg N kg-1). This is not the case for the sand soil (19.2 vs 16 mg N kg-1). Do the numbers in Table 3 refer to pre-incubated soil material? If so, it is strange that the N concentrations of the silt-loam do not differ. This should also be clarified.
A5: This is a good point. Table 1 shows values measured from field-fresh soil, while Table 3 shows values prior to pre-incubation, after the soil had been stored for some time. As mineral N is not a basic soil property but instead subject to change, we have removed the values from Table 1 and just keep those from Table 3, which more accurately reflect conditions at the beginning of the incubations.
(iii) Figures 3 and 7: In Figure 3d the measured CO2 fluxes of the four individual control cores of the sand soil are shown. Around day 10, a maximum respiration rate of about 0.4 g C m-2 d-1 is depicted for three of the cores; the fourth one had a lower respiration rate. However, in Figure 7 an average CO2 emission rate of almost 0.8 g C m-2 d-1 is shown. The peak around day 5 (Figure 7) seems not to be correct either. This is an important point (comparison of measured and modelled data) and should be carefully checked.
(iv) Discussion: The summary of the results partly does not agree with the data shown in the respective figures and tables or is misleadingly formulated (l. 578-582; see detailed comment below).

A7: Please, see response below.
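The concern raised in comment (iii) can be checked with simple arithmetic: an average over cores can never exceed the largest single-core value. The following is a minimal Python sketch of that check, assuming three cores at the roughly 0.4 g C m-2 d-1 read from Figure 3d and a hypothetical 0.3 g C m-2 d-1 for the fourth, lower core. That fourth value and the function names are illustrative assumptions, not measured data.

```python
def mean_flux(fluxes):
    """Arithmetic mean of per-core fluxes (g C m-2 d-1)."""
    return sum(fluxes) / len(fluxes)


def mean_within_range(fluxes, reported_mean, tol=1e-9):
    """Return True if a reported mean lies between the smallest and
    largest single-core value, which any true mean must do."""
    return min(fluxes) - tol <= reported_mean <= max(fluxes) + tol


# Three cores at ~0.4 (as read from Figure 3d by the reviewer);
# the fourth value (0.3) is a HYPOTHETICAL placeholder for "lower".
cores = [0.4, 0.4, 0.4, 0.3]

print(mean_flux(cores))               # below 0.4 by construction
print(mean_within_range(cores, 0.8))  # an average of ~0.8 is not possible
```

Under these assumptions the plotted average of almost 0.8 g C m-2 d-1 cannot come from the four cores shown, which is consistent with an outdated figure version rather than a different averaging method.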
Tables and figures: As outlined above, the huge number of tables and figures makes it sometimes difficult to follow the text. I strongly recommend to reduce the number of figures and tables and only keep those which are really necessary for the aims of the study (not all data generated must be shown).
A8: We will, as requested, remove some tables and figures. See more details below.
(i) It is obvious that not all tables provided as supplementary material are necessary, since there is no reference to some of these tables. All tables that are not referred to in the text must be omitted (Tables S.3, S.4, S.5).
A9: Although we agree that we need to reduce the number of the figures and tables, some of the data we included are for use in further modelling studies and therefore, also tables and data that were not cited may be worth including. However, it is a good point that we need to make the readers aware that the information is available, so we will include a line in the results that summarizes what additional information is included in the supplementary material and then cite those figures and tables that are otherwise not mentioned.
Specific changes we will make: We will merge Table 2 and Table 3. We will merge Table 5 and Table 6.
(ii) Tables 2 and 3 could be merged, since the only additional information in Table 3 is "Calculated 15N enrichment". The initial mineral N concentrations are already shown in Table 1 and "Added N" in Table 2.
A10: We will remove Table 3, and the information about Nmin and the "Calculated 15N enrichment" will be added to Table 2 or the description of the experiments.
(iii) Since one focus of the manuscript is on the comparison of measured and modelled data, the measured data need not be presented in as much detail as in the present manuscript. Therefore, I recommend omitting Figures 2 and 3, because all information needed for the comparison of measured and modelled data is included in Figure 7 (average N2O+N2 and CO2 emission rates).
(iv) Figures 4, 6, and 7: The sub-figures (d) to (f) should be (a) to (c), since respiration rates are described first in the text, i.e. before the N2+N2O data. The order of the figures/tables should correspond to the order in which they are cited in the text. Table S.5, the order is changed.
A15: This is an excellent point. We changed the order of the results in the figures and the tables as suggested.

Specific comments
Title: The title must be revised. The title focuses on denitrification, but CO2 which is not an end-product of denitrification is also mentioned; "decomposition" should be included. "Denitrification" alone is too unspecific. It should be clearly mentioned what was evaluated (e.g. denitrification products, temporal variations).
A16: We will revise the title of the article as requested.

Evaluation of temporal variation of denitrification products and decomposition from three biogeochemical models using laboratory measurements of N2, N2O and CO2
Abstract: The structure could be improved. The motivation of the study does not really become clear. The aim/objective provided is too vague or appears too late in the Abstract (l. 29/30).
A17: Both of the reviewers commented on the abstract, so we will take some time to revise it. The aims/goals must be clear and obvious, and we will revise the structure to make that the case.
l. 14/15: What is exactly the research gap that should be highlighted here? (Bad agreement between measured and modelled data, missing processes, …..?)
A18: Inappropriate input data, simplified process descriptions and imprecise model parameters result in poor prediction of N2 and N2O fluxes, while the scarce availability of reliably measured N2 and N2O soil flux data makes it difficult to properly validate denitrification models. We suggest that the accuracy of models could be improved using appropriate, recent datasets, and we use two new datasets to test the model outputs from 3 models and identify the most urgently needed areas for improvement.
l. 16: "to test the denitrification sub-modules of existing biogeochemical models" is too unspecific. It has to be mentioned with regard to what the sub-modules were tested. The information provided in l. 22 should be included.
A19: We will modify the text as suggested: "The denitrification and decomposition sub-modules of three common biogeochemical models (Coup, DNDC and DeNi) were tested using the data. We use measured data from two laboratory incubations to test the model response on NO3-, soil water content and temperature manipulations on denitrification sub-modules of existing biogeochemical models. No systematic calibration of the model parameters was conducted since our intention was to evaluate the general model structure or 'default' model runs."
l. 20/21 "Three common…" and first part of l. 21/22 "No systematic….": This information should be included in l. 16.
A20: Yes, see the previous answer.
l. 28/29: uncertainties: More specific information is needed here, for instance, one or two examples or a short explanation.
A21: We will modify the text as follows: Differences between the measured and modelled values can be traced back to model structure and/or parameter uncertainty (i.e. modelled microbial growth affected the timing of N2 and N2O fluxes, while modelled hydrology affected whether anaerobic conditions were present for complete denitrification).
l. 29: This information must be provided earlier.
A22: Yes, as shown above, we will specify the aim of the paper in Line 14/15.

Introduction:
Some additional background information is needed (see comment on l. 68-72).
l. 37: "nitrogen" must be replaced by "N".
A23: We will change the text as suggested.
l. 42: The colon should be left out.
A24: We will modify the text as suggested.
l. 44 and l. 47: The information in brackets should be provided as subclauses ("which is a function…", "i.e. high background….").
A25: We will follow this suggestion.
l. 55: "input data may result" instead of "input data result".
A26: We will modify the text as suggested.
l. 57/58: The references cited here are the descriptions of the models used in the present study. Do they really demonstrate that "measurements of both N2O and N2 fluxes…. are necessary to develop and test algorithms"? Del Grosso et al. (2000) is missing in the reference list.
A27: Thank you for pointing out the missing reference.
We agree that these references don't necessarily demonstrate that both N2O and N2 fluxes are necessary to develop and test algorithms. To be more precise, we will use "…both N2O and N2 fluxes are necessary to develop and/or test algorithms".
l. 68-72: This passage needs to be revised. It must be clearly shown that the models are not able to properly predict denitrification processes and the dynamics/fluxes of the end-products (research gap must be comprehensibly identified). Appropriate references must be cited. At the moment, it is only mentioned that the models were used with "success". The fact that the use of the acetylene inhibition technique may lead to incorrect results does not prove that the denitrification sub-modules of biogeochemical models provide incorrect predictions.
l. 76-86: The first sentence (l. 76/77) should be left out. Instead, "specifically" in l. 83 should be deleted and the aims presented in l. 83-86 should be moved to the beginning of the paragraph (laboratory incubations are considered in the aims).

Materials and methods:
Some important information is missing.
l. 95: The soil classification system used (World Reference Base for Soil Resources) should be added and "organic matter" should be replaced by "organic carbon".
A33: We will add the information that the soil classification system used was the World Reference Base for Soil Resources and change the text to "organic carbon" as requested.
l. 97 and l. 110: More information is needed about how the soil samples were obtained. Using an auger, steel rings, a spade?
A34: We will add the information that spades and shovels were used to collect the soil.
l. 97 and 111: Why were the soil samples sieved to 10 mm and not 2 mm? Soil chemical analyses are usually conducted on the fine-earth fraction (< 2 mm). No additional sieving for soil chemical analysis is mentioned.
A35: We will include the following information: 10 mm sieving using a high capacity rotary sieve was part of the procedure to prepare the approximately 2 tons of soil for this extensive joint research project, which provided the base soil for several groups. The use of 2 mm sieving prior to mineral N analysis was omitted to allow unbiased quantification of these N species.
Table 1: The number of decimal places should be consistent within the table; C/N ratios should be provided without decimal place. The unit of bulk density should be written as "g cm-3"; "CaCl2" should be placed immediately after "pH" and not in the same line as the units of the other soil properties.
A36: We will modify Table 1 as suggested.
l. 111: Were the NO3 and NH4 concentrations also determined using air-dried soil material? Nitrate and NH4 should be extracted as soon as possible after sampling; air-drying of the soil samples is not recommended. This may lead to erroneous results.
A37: The original mineral N extraction was done with field-fresh material, but of course, as mentioned above, mineral N is subject to change, so we have removed these values from Table 1 and just keep those from Table 3, which more accurately reflect conditions at the beginning of the incubations.
l. 117: How many replicates per treatment were used?
A38: We had three replicates per treatment. We will add the missing information to the description.
l. 118: Delete "then" before "added".
l. 121: How was "water content kept constant"?
A41: To clarify the method we will reformulate the description: "During the incubation, only temperature was changed (Fig. S.3), while the initial settings of water content were not changed and loss of soil water by evaporation was minimized because the mesocosms were kept closed."
l. 123-126 and l. 161/162: All information needed to understand and also interpret the results of the present study needs to be provided (even if they are already published), since some readers may not have the possibility to access the relevant article(s). Therefore, for all technical devices used, information about the model, the manufacturer and the location of the manufacturer's headquarters should be added. The basic principle of the methods used to determine NO3, NH4, etc. should also be included; pH: Which solvent was used?
A42: We will add all missing information and method descriptions to the text. CaCl2 solvent was used for the pH measurements.
l. 125: "at the beginning": Were the analyses conducted using pre-incubated soil material or the "original" soil material?
A43: To clarify the method we will reformulate the description: "Soil samples were collected after pre-incubation immediately before packing of the mesocosm as well as at the end of the incubation."
Table 2: The number of decimal places should be consistent within the table (bulk density, WFPS). The units of bulk density and water content are missing. All units must be written with square brackets. The abbreviation "WFPS" has to be defined. What do the asterisks in the last columns mean?
A44: We will modify Table 2 as suggested and define the abbreviation earlier in the text. The asterisks will be deleted.
Table 3: Tables 2 and 3 should be combined (see general comment on "Tables and Figures" above) and Table 3 left out.
A45: We agree and we will merge Table 2 and Table 3 as suggested.
l. 140: "C-to-N" should be replaced by "C/N" (cf. Table 1) and "nitrogen" by "N".
A46: We will follow the suggestion of the reviewer.
l. 149: At which days/time steps were gas samples collected manually?
A47: The samples were collected every third day. We will add this information to the description.
A48: We will change the format as suggested.
l. 174-176: This is discussion material.
A49: The sentence will be moved to the discussion.
l. 189: I guess, NH4 would be correct (instead of NH3).
A50: Yes, it should have been NH4+. We will change it accordingly.
l. 207: Delete the colon.
A51: We will delete the colon.
l. 215-217: This information would better fit in the Introduction.

A52: Good point, we will move this information to the introduction.
l. 219: Table S.6 should be Table S.1, etc.
A53: We will modify the order of the Tables in the supplementary material as suggested.
l. 259 "treatments": There is only one treatment (ryegrass addition) and one control. "Treatments" should be replaced by "soil cores".
A54: We replaced the text as suggested: "For the sand soil cores with application of ryegrass, the C and N of ryegrass were exclusively added to the labile pool."
l. 264: Which model parameters and settings were modified and how? More specific information must be provided.
A55: We provided more information in Table S.7 in the supplementary material. We will modify the text of the article accordingly.
l. 272-279: A clearer separation between silt-loam and sand soil is needed here.
A56: We agree and will reformulate: "For the silt-loam soil, we ran the model with one soil layer because water content was assumed homogeneous. For the sand soil, however, five 2 cm thick soil layers with differing water content were simulated because significant differences in water content were evident/expected."
l. 301: How was normality checked?
A62: We measured 3 replicates but the figure with error bars would not be interpretable anymore. We will add the standard deviation of cumulative fluxes in Table 5.

manipulation events occurred."
l. 340 "fluctuations in the CO2 fluxes": Is this information really needed here?
A64: We agree that this information is unnecessary, and we will delete this.
l. 344: The reference to Table S
l. 344/345 "Initially,…": This information is redundant and therefore not needed here.
A66: To clarify temporal dynamics, we will reformulate this sentence: "N2+N2O fluxes were initially high in both treatments (Figs. 2a and 3a) but decreased rapidly following the drainage period during the first 12 days of incubation (see Table S1 and
A67: We agree with you that we need to reformulate this sentence. Previously, we used the date form and not the days, and these dates still remained in the text. We will change them. We will also revise the text "N2+N2O: core 1 and 2 and limited core 3" to agree with Figure 2a.
l. 375: "d-f" should be added after "Fig. 4".
A68: We will add the text as suggested.
l. 376-379: Where is this shown? Table 8?
A69: These data were shown in Table 8. We revised the text: "On average, DeNi calculated ~4 times higher N2+N2O fluxes than measured. In contrast to this, N2+N2O fluxes obtained from Coup were about 4 times lower than the measured values, despite the fact that the N2O estimation of Coup was quite close to the measured values (Table 8). In DNDC, it is notable that N2 fluxes were always zero and it therefore underestimated N2+N2O fluxes even more (~30 times) than Coup (Table 8)."
l. 380: "and DNDC" should be added after "Coup". The reference to Figure S.6 should be moved to the end of the sentence.
A70: We will modify the text as suggested.
l. 391: "smaller" must be corrected to "higher".
A71: Agreed, we will change "smaller" to "higher".
l. 396-406: This paragraph needs to be revised. It is absolutely inconsistent. According to l. 396, it deals with "cumulative N2+N2O fluxes". For Coup and DeNi, cumulative fluxes are described. However, for DNDC, there is a reference to Table 8 in which average fluxes are shown and l. 404-406 deal with Table 7 instead of describing the results shown in Figure 5 (cumulative fluxes). The explanation for the numbers in Table 7 is in the next paragraph (l. 407-413).
A72: We agree with the reviewer that this paragraph should be better structured. We will add a description of the DNDC (based on Fig. 5) and update the results based on the suggestions of Reviewer II. We will move sentences 404-406 to the next paragraph.
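The fold-differences quoted in A69 (about 4 times higher, about 4 times lower, about 30 times lower) can be read as ratios between average modelled and average measured fluxes. The following is a minimal Python sketch of one way to compute such a ratio. The function name and all numeric values are hypothetical placeholders chosen to reproduce the quoted factors; they are not the values in Table 8.

```python
def fold_difference(modelled, measured):
    """Express a modelled average flux relative to the measured one.

    Returns ("over", factor) if the model is at or above the
    measurement and ("under", factor) otherwise, with factor >= 1.
    """
    if measured <= 0 or modelled < 0:
        raise ValueError("need measured > 0 and modelled >= 0")
    if modelled == 0:
        # A model that predicts no flux underestimates without bound.
        return ("under", float("inf"))
    if modelled >= measured:
        return ("over", modelled / measured)
    return ("under", measured / modelled)


# HYPOTHETICAL average N2+N2O fluxes in arbitrary units, not Table 8 data.
measured_avg = 12.0
modelled_avg = {"DeNi": 48.0, "Coup": 3.0, "DNDC": 0.4}

for model, value in modelled_avg.items():
    direction, factor = fold_difference(value, measured_avg)
    print(f"{model}: {direction}estimates by a factor of ~{factor:.0f}")
```

Reporting the direction separately from a factor that is always at least 1 avoids ambiguous statements such as "0.25 times higher" and keeps over- and underestimation on the same scale.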
l. 431-434: See general comment on "Inconsistencies" above. The measured data shown may be partly incorrect.
A73: Yes, as we commented above, the uploaded version of Figure 7 here was an old version, which we will replace with the up-to-date version.
l. 443: "are" should be corrected to "were".
A74: We will correct the text as suggested.

Discussion:
References to figures and tables should be kept to an absolute minimum in the Discussion section (too many references are, for example, in l. 499, l. 501-503, l. 504-526, l. 545, l. 575-582). An informative summary of the results that are discussed in the following is sufficient, since the results were already described in detail in the Results section. Section 4.1.2 should be shortened. Not every change in gas fluxes must be explained in great detail. A more general description of the assumed processes and possible causes would be more useful.
A75: We note that this is a long article, and it is easier for readers to not have to search for results that are referenced in the text. However, we agree that several of these can be removed and that Section 4.1.2 should be shortened to highlight key changes.
l. 494/495: This section is confusing. According to the Materials and methods section, the whole soil material was pre-incubated (not only the control soil).
A76: Yes, all soil material was pre-incubated. We have changed this sentence to make it clearer: "In contrast, the control soils not only had no ryegrass amendment but were also pre-incubated (further decreasing the amount of labile carbon present by the time the incubation started)."
l. 500-503: Shorter; the information provided here can be combined.