Code:fao_usda.csv

From the change wiki

fao_usda.csv is a cross-reference table, to connect data from two sources:

  1. FAO Primary Crop Production
  2. USDA Nutrition Data, SR Legacy

Purpose: to help analyze crop production in terms of calories, protein, etc.

FAO data specifies crop production in tonnes, but for some crops this includes inedible parts. For example, a tonne of bananas includes the peels; a tonne of apples includes the cores.

USDA data specifies the nutrition per 100 g edible portion of a food. It usually also specifies "percent refuse", which shows, for example, what percentage of a banana is the peel. But sometimes that value is missing - for example, for peanut shells.

The Multiplier column covers those cases: it scales the FAO harvest weight to the weight of the matched USDA food. Oilseeds also use the Multiplier (for example, as the fraction of the seed cotton harvest that is cottonseed oil).

Table columns:

  1. `Item Code (FAO)` -- FAO crop ID
  2. `NDB_No` -- USDA food ID
  3. `Multiplier`
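
A minimal sketch of how the three columns might be combined, assuming the Multiplier and the USDA percent refuse are both applied multiplicatively to the FAO tonnage. The two cross-reference rows are taken from this table; the production figures, the USDA values, and the names `production_tonnes`, `usda`, and `kcal_from_production` are hypothetical placeholders, not part of any real dataset.

```python
import csv
import io

# Two real rows from fao_usda.csv (bananas and groundnuts with shell).
XREF_CSV = '''"Item Code (FAO)","NDB_No","Multiplier"
"486","09040","1"
"242","16087","0.70"
'''

# Hypothetical FAO production figures, in tonnes.
production_tonnes = {"486": 1000.0, "242": 1000.0}

# Hypothetical USDA values (placeholders, not real SR Legacy numbers).
usda = {
    "09040": {"kcal_per_100g": 90.0, "refuse_pct": 35.0},
    "16087": {"kcal_per_100g": 570.0, "refuse_pct": 0.0},
}

def kcal_from_production(item_code, xref_rows):
    """Convert FAO tonnes to kcal via the cross-reference row for item_code."""
    row = next(r for r in xref_rows if r["Item Code (FAO)"] == item_code)
    food = usda[row["NDB_No"]]
    # Assumption: Multiplier and percent refuse combine multiplicatively.
    edible_fraction = float(row["Multiplier"]) * (1 - food["refuse_pct"] / 100)
    edible_grams = production_tonnes[item_code] * 1_000_000 * edible_fraction
    return edible_grams / 100 * food["kcal_per_100g"]

xref_rows = list(csv.DictReader(io.StringIO(XREF_CSV)))
banana_kcal = kcal_from_production("486", xref_rows)
groundnut_kcal = kcal_from_production("242", xref_rows)
```

For bananas only the refuse percentage reduces the tonnage; for groundnuts the shell is removed by the Multiplier instead, because the (placeholder) refuse value is zero.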

This table is in the Code: namespace because (even though it's a .csv data table) it was made "in-house" here on this wiki. It was not a transformation of some other data file.

Code

"Item Code (FAO)","NDB_No","Multiplier"
"221","12061","1"
"711","02002","1"
"515","09003","1"
"526","09021","1"
"366","11007","1"
"367","11011","1"
"572","09037","1"
"203","16056","1"
"486","09040","1"
"44","20004","1"
"176","16014","1"
"414","11052","1"
"558","09302","1"
"552","09050","1"
"216","12078","1"
"181","16052","1"
"89","20008","1"
"358","11109","1"
"101","12021","1"
"461","16055","1"
"426","11124","1"
"217","12085","0.30"
"591","09003","1"
"125","11134","1"
"378","11568","1"
"393","11135","1"
"108","20088","1"
"531","09070","1"
"530","09063","1"
"220","12097","1"
"191","16056","1"
"459","11154","1"
"689","02009","1"
"401","11333","1"
"693","02010","1"
"698","02011","1"
"661","19078","1"
"249","12104","1"
"328","04502","0.096"
"195","16062","1"
"554","09078","1"
"397","11205","1"
"550","09083","1"
"577","09087","1"
"399","11209","1"
"569","09089","1"
"94","20036","1"
"512","09149","1"
"619","09174","1"
"542","09003","1"
"541","09279","1"
"603","09422","1"
"406","11215","1"
"720","11216","1"
"549","09107","1"
"103","20062","1"
"507","09111","1"
"560","09132","1"
"242","16087","0.70"
"225","12120","1"
"336","12012","0.50"
"263","04536","0.25"
"592","09148","1"
"407","11246","1"
"497","09150","1"
"201","16069","1"
"372","11251","1"
"333","12220","0.833"
"210","16076","1"
"56","20014","1"
"446","11167","1"
"571","09176","1"
"568","09181","1"
"299","12174","1"
"79","20031","1"
"449","11260","1"
"292","02024","1"
"702","02025","1"
"234","12142","1"
"75","20038","1"
"254","04055","0.1968856"
"339","04517","0.3"
"430","11278","1"
"260","09193","2.0"
"403","11282","1"
"402","11677","1"
"490","09200","1"
"600","09226","1"
"534","09236","1"
"521","09252","1"
"187","16085","1"
"417","11304","1"
"687","02030","1"
"748","02064","1"
"587","09263","1"
"197","16101","1"
"574","09266","1"
"223","12151","1"
"489","09277","1"
"536","09279","1"
"296","02033","1"
"116","11352","1"
"211","16001","1"
"394","11422","1"
"523","09296","1"
"92","20035","1"
"270","04582","0.38"
"547","09302","1"
"27","20040","0.667"
"149","11697","1"
"71","20062","1"
"280","12021","1"
"289","12023","1"
"83","20067","1"
"236","16108","1"
"723","02015","1"
"373","11457","1"
"544","09316","1"
"423","11052","1"
"157","19335","0.16"
"156","19334","0.13"
"161","19335","0.10"
"267","12036","1"
"122","11507","1"
"305","04034","0.30"
"495","09218","1"
"136","11518","1"
"388","11529","1"
"97","20069","1"
"463","11429","1"
"420","11088","1"
"205","11143","1"
"222","12155","1"
"567","09326","1"
"15","20080","1"
"137","11601","1"
"135","11991","1"
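
When loading the table, note that many `NDB_No` values have a leading zero (for example `02002`), so they should probably be kept as strings rather than parsed as integers. The following is a minimal sketch using Python's standard `csv` module; the function name `load_xref` and the two-row sample are for illustration only.

```python
import csv
import io

# Two rows copied from the table above, standing in for the full file.
SAMPLE = '''"Item Code (FAO)","NDB_No","Multiplier"
"711","02002","1"
"217","12085","0.30"
'''

def load_xref(fileobj):
    """Return {FAO item code: (NDB_No, Multiplier)}, keeping both IDs as strings."""
    table = {}
    for row in csv.DictReader(fileobj):
        table[row["Item Code (FAO)"]] = (row["NDB_No"], float(row["Multiplier"]))
    return table

xref = load_xref(io.StringIO(SAMPLE))
```

With a real file, `load_xref(open("fao_usda.csv", newline=""))` would work the same way, assuming the file is saved under that name.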

Methodology

800	Agave fibres nes - skip. fibre crop.
221	Almonds, with shell - 12061
711	Anise, badian, fennel, coriander - 02002
515	Apples - 09003
526	Apricots - 09021
226	Areca nuts - skip because they aren't really used as a calorie/protein source, and there is no USDA data on them
366	Artichokes - 11007
367	Asparagus - 11011
572	Avocados - 09037
203	Bambara beans - 16056 [3]
486	Bananas - 09040
44	Barley - 20004
782	Bastfibres, other - skip. fibre crop.
176	Beans, dry - 16014 (i chose black beans as the standard)
414	Beans, green - 11052
558	Berries nes - 09302
552	Blueberries - 09050
216	Brazil nuts, with shell - 12078
181	Broad beans, horse beans, dry - 16052
89	Buckwheat - 20008
358	Cabbages and other brassicas - 11109
101	Canary seed - 12021
461	Carobs - 16055
426	Carrots and turnips - 11124
217	Cashew nuts, with shell - 12085 - Multiplier=0.30 [2]
591	Cashewapple - 09003 [4]
125	Cassava - 11134
378	Cassava leaves - 11568
265	Castor oil seed - skip because it's toxic to eat
393	Cauliflowers and broccoli - 11135 (cauliflower) and 11090 (broccoli)
108	Cereals nes - 20088
531	Cherries - 09070
530	Cherries, sour - 09063
220	Chestnut - 12097
191	Chick peas - 16056
459	Chicory roots - 11154
689	Chillies and peppers, dry - 02009
401	Chillies and peppers, green - 11333 (bell pepper) or 11979 (jalapeno)
693	Cinnamon (cannella) - 02010
698	Cloves - 02011
661	Cocoa, beans - 19078 (no USDA data on cacao beans, so just using baking chocolate (100% cacao))
249	Coconuts - 12104
656	Coffee, green - couldn't find USDA for actual coffee beans. just use 14210 (espresso)
813	Coir - fibre. skip.
195	Cow peas, dry - 16062
554	Cranberries - 09078
397	Cucumbers and gherkins - 11205
550	Currants - 09083 (black) or 09084 (red n white)
577	Dates - 09087, I guess; hopefully the difference between dried and fresh isn't significant
399	Eggplants (aubergines) - 11209
821	Fibre crops nes - skip
569	Figs - 09089
773	Flax fibre and tow - skip
94	Fonio - 20036 (just using brown rice, close enough)
512	Fruit, citrus nes - 09149
619	Fruit, fresh nes - 09174
542	Fruit, pome nes - 09003
541	Fruit, stone nes - 09279
603	Fruit, tropical fresh nes - 09422
406	Garlic - 11215
720	Ginger - 11216
549	Gooseberries - 09107
103	Grain, mixed - 20062
507	Grapefruit (inc. pomelos) - 09111
560	Grapes - 09132
242	Groundnuts, with shell - 16087 - Multiplier=0.70 [2]
225	Hazelnuts, with shell - 12120
777	Hemp tow waste - skip. fibre
336	Hempseed - 12012. Multiplier=0.5 [5]
677	Hops - no usda, and probably not much of the nutritional value gets used. so skip
277	Jojoba seed - oil is not typically eaten, so skip
780	Jute - skip. fibre
778	Kapok fibre - skip
310	Kapok fruit - 09176 (no USDA data available; just gonna use mango)
263	Karite nuts (sheanuts) - 04536 - Multiplier=0.25 [1]
592	Kiwi fruit - 09148
224	Kola nuts - skip because I can't find nutrition data aside from the fact that it's 2-3.5% caffeine. Anyway, it's a flavoring, not a food
407	Leeks, other alliaceous vegetables - 11246
497	Lemons and limes - 09150
201	Lentils - 16069
372	Lettuce and chicory - 11251
333	Linseed - 12220 - Multiplier=0.833 [6]
210	Lupins - 16076
56	Maize - 20014
446	Maize, green - 11167
571	Mangoes, mangosteens, guavas - 09176
809	Manila fibre (abaca) - skip
671	Maté - skip because it doesn't contribute significant calories or protein
568	Melons, other (inc.cantaloupes) - 09181
299	Melonseed - 12174 close enough
79	Millet - 20031, but hopefully that doesn't include the hulls
449	Mushrooms and truffles - 11260
292	Mustard seed - 02024
702	Nutmeg, mace and cardamoms - 02025
234	Nuts nes - 12142
75	Oats - 20038
254	Oil palm fruit - 04055 - Multiplier=0.1968856 [7]
339	Oilseeds nes - 04517 - Multiplier=0.3 [1]
430	Okra - 11278
260	Olives - 09193 - Multiplier=2.0 [8]
403	Onions, dry - 11282
402	Onions, shallots, green - 11677
490	Oranges - 09200
600	Papayas - 09226
534	Peaches and nectarines - 09236
521	Pears - 09252
187	Peas, dry - 16085 (usda is split, which is missing the hulls, but it's probably less than 1% difference, and besides, the hulls are edible)
417	Peas, green - 11304 i think, make sure USDA shows it's mostly water
687	Pepper (piper spp.) - 02030
748	Peppermint - 02064
587	Persimmons - 09263
197	Pigeon peas - 16101 for mature seeds (probably dry) vs 11344 for immature seeds (prob hi moisture). check yields to guess which one should be used
574	Pineapples - 09266
223	Pistachios - 12151
489	Plantains and others - 09277
536	Plums and sloes - 09279
296	Poppy seed - 02033
116	Potatoes - 11352
211	Pulses nes - 16001 (adzuki)
394	Pumpkins, squash and gourds - 11422 (pumpkin) or 11467 (squash) or 11218 (gourd)
754	Pyrethrum, dried - skip bc it's not a food
523	Quinces - 09296
92	Quinoa - 20035
788	Ramie - skip. fibre
270	Rapeseed - 04582 - Multiplier=0.38 [1]
547	Raspberries - 09302
27	Rice, paddy - 20040 - Multiplier=0.667 [1] - The Multiplier can also be confirmed by looking at the data: For every row of "Rice, paddy", there is an equivalent row of "Rice, paddy (rice milled equivalent)" with the value multiplied by 0.667.
30	Rice, paddy (rice milled equivalent) - Skip to avoid double counting. While you might think it would be better to use ''this row'' instead of "Rice, paddy" (to avoid needing a Multiplier), this row doesn't have all the data elements - it lacks "Area harvested" and "Yield".
149	Roots and tubers nes - 11697
836	Rubber, natural - skip. inedible
71	Rye - 20062
280	Safflower seed - 12021 - assuming the FAO data is for seeds in hulls
328	Seed cotton - 04502 - Multiplier=0.096 [9]
289	Sesame seed - 12023
789	Sisal - skip. fibre
83	Sorghum - 20067
236	Soybeans - 16108
723	Spices nes - 02015 - using 'curry powder' because it's a mix of spices, so most likely to have an 'average' nutrition profile.
373	Spinach - 11457
544	Strawberries - 09316
423	String beans - 11052
157	Sugar beet - 19335 - Multiplier=0.16 [10]
156	Sugar cane - 19334 - Multiplier=0.13 [11]
161	Sugar crops nes - 19335 - Multiplier=0.10 but this is completely arbitrary. no data available [12]
267	Sunflower seed - 12036 - assuming FAO counts the shells
122	Sweet potatoes - 11507
305	Tallowtree seed - 04034 - Multiplier=0.30 [1]
495	Tangerines, mandarins, clementines, satsumas - 09218
136	Taro (cocoyam) - 11518
667	Tea - skip bc not a significant source of calories or protein
826	Tobacco, unmanufactured - skip for the same reason as tea
388	Tomatoes - 11529
97	Triticale - 20069
275	Tung nuts - skip bc not used for food; the oil is for wood finishes
692	Vanilla - skip bc it's just a flavoring; nutrition is insignificant
463	Vegetables, fresh nes - 11429
420	Vegetables, leguminous nes - 11088
205	Vetches - 11143 - just using celery - not enough data on vetches [13]
222	Walnuts, with shell - 12155
567	Watermelons - 09326
15	Wheat - 20080
137	Yams - 11601
135	Yautia (cocoyam) - 11991
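
The rice entries above (items 27 and 30) say the Multiplier can be confirmed from the FAO data itself. A minimal sketch of that check follows, assuming you have a matching pair of values for the two items; the helper name `is_milled_equivalent`, the tolerance, and the sample numbers are hypothetical.

```python
import math

MILLING_FACTOR = 0.667  # Multiplier for item 27 (Rice, paddy), from the list above

def is_milled_equivalent(paddy_value, milled_value, rel_tol=1e-3):
    """True if milled_value is paddy_value scaled by the milling factor."""
    return math.isclose(paddy_value * MILLING_FACTOR, milled_value, rel_tol=rel_tol)

# Hypothetical pair of production values in tonnes, not real FAO figures.
paddy, milled = 1_000_000.0, 667_000.0
rice_ok = is_milled_equivalent(paddy, milled)
```

Running this over every year and country pair would be one way to confirm that item 30 adds no independent information.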

footnotes:
[1] FAOSTAT, "Definitions and standards - Crops - Item Group - Oilcrops, Oil Equivalent" [https://www.fao.org/faostat/en/#data/QCL]
[2] FAOSTAT, "Definitions and standards - Crops - Item"                                  [https://www.fao.org/faostat/en/#data/QCL]
[3] Bambara beans are "known as a 'complete food' as the seeds contain on average 63% carbohydrate, 19% protein and 6.5% fat, making it a very important source of dietary protein." - http://www.fao.org/traditional-crops/bambaragroundnut/en/ . This nutritional profile is remarkably similar to chick peas, so that's the USDA data I chose to use.
[4] Cashewapple nutrition facts are found in https://hort.purdue.edu/newcrop/morton/cashew_apple.html. Looks remarkably similar to USDA data for apples.
[5] From "Hempseed in food industry: Nutritional value, health benefits, and industrial applications" - https://onlinelibrary.wiley.com/doi/full/10.1111/1541-4337.12517 : the Multiplier 0.5 is inferred from the fact that measurements of protein, fat, and oil in "whole seed" are roughly at the midpoint between "dehulled seed" and "hemp hulls". Note: the hulls do in fact have some protein and oils, but the fibre content is so high that I'm counting them as inedible for humans. They may in fact have a place in the human diet. Note: another data source compares the protein and oil content of shelled and unshelled hemp seeds: https://www.aocs.org/stay-informed/inform-magazine/featured-articles/hempseed-oil-in-a-nutshell-march-2010 see image1 chart
[6] Linseed: USDA says flaxseed is 42% fat, but FAO [1] says 35%. Assume the discrepancy is due to FAO "linseed" data including hulls (flaxseed nutrition data does not). So let's use a Multiplier of 0.833 (=35%/42%)
[7] Oil palm fruit: To get the Multiplier, we look at a different FAO dataset: [[:File:fao-crops-processed.csv]], which includes the production of palm oil (item code 257) and palm kernel oil (item code 256). Add those two together, divide by the ''oil palm fruit'' production of this dataset, and you get the Multiplier. Using the most recent 2 years where data is available (2018 and 2019), we get (71735061 + 74583225 + 7918019 + 8226464) / (409265212 + 415898058) = 0.1968856. For simplicity's sake, we count all the oil as palm oil, even though a small percent is palm ''kernel'' oil. The latter has a bit more saturated fat, but the calories are the same.
[8] Olives: USDA data is only available for canned olives. Such data shows 10% oil. But FAO [1] says olives are 22% oil. If we assume that the discrepancy is entirely because the canned olives are diluted with saltwater... then we use a Multiplier of 2.2. But then again, olives intended for oil probably have more oil whereas olives for canning might have more starch and/or protein. So I'm lowering the Multiplier to 2.0.
[9] Seed cotton is "55%-65%" seeds [2], and cottonseeds are 16% oil [1]. Thus, Multiplier = 60% * 16% = 9.6% = 0.096
[10] "Sugar beets contain in average 16 % sugar, 80 % of which can be recovered by the extraction process [...] remaining sugar (non-crystallised) are left with the molasses." - http://www.fao.org/3/a-ae377e.pdf . Let's count all 16%, but count it as granulated sucrose for simplicity's sake. There is no USDA data on beet molasses anyway.
[11] "The sugar content of sugar cane ranges from 10 to 15 percent of the total weight" - http://www.fao.org/es/faodef/fdef03e.HTM
[12] Sugar crops nes: includes sugar maple, sugar palm, and sweet sorghum. "In the case of saps, production is to be expressed in liquid equivalent." but who knows what that means. Maple sap is about 6% sugar. Sugar palm is probably more. And sorghum is probably over 60% after hydrolysing the starches... but it's probably already counted under "83. Sorghum" - how am I supposed to know? Anyway, I'm just gonna say sugar crops nes are 10% sugar. Might be wrong, but it's not a big part of total production so whatever.
[13] Vetches - too many unknowns: what parts of the crop can be eaten by humans? vs by animals? what parts does the FAO harvest weight include? where can i find nutrition data??? Screw it, I'm just gonna use USDA celery data and call it a day.
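
The derived Multipliers in footnotes [6] to [9] can be reproduced with a few lines of arithmetic. This is only a sketch using the figures quoted in those footnotes; the variable names are made up for illustration, and the 60% seed share is the midpoint of the "55%-65%" range, as in footnote [9].

```python
# [6] Linseed: FAO oil share divided by USDA fat share.
linseed = 35 / 42

# [7] Oil palm fruit: palm oil plus palm kernel oil for 2018 and 2019,
# divided by oil palm fruit production for the same two years.
oil_tonnes = 71735061 + 74583225 + 7918019 + 8226464
fruit_tonnes = 409265212 + 415898058
oil_palm = oil_tonnes / fruit_tonnes

# [8] Olives: FAO oil share divided by the oil share of canned olives.
# The table uses 2.0, a judgment call below this undiluted ratio.
olives_undiluted = 22 / 10

# [9] Seed cotton: seed share of the harvest times oil share of the seed.
seed_cotton = 0.60 * 0.16

for name, value in [("linseed", linseed), ("oil palm", oil_palm),
                    ("olives", olives_undiluted), ("seed cotton", seed_cotton)]:
    print(f"{name}: {value:.7f}")
```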

other considerations:
Oilseeds have other nutrients besides oil. Those nutrients could potentially be used in food. But for some oilseeds (canola, cotton, palm), they currently are not. And the USDA nutrition data is only available for their refined oils. Therefore, this analysis only counts their oils. Hopefully we can explore the greater potential of oilseeds in some other analysis.