doc_reader example
Austin Almond edited this page Apr 20, 2018 · 5 revisions
Example output of DocReader.read_docs:
topics_data = {'topics': [{'title': ' Giant Panda ', 'category': '4', 'id': 'D1003A', 'docset': {'AFP_ENG_20050128.0218': {'type': 'story', 'keywords': [], 'paragraphs': ["China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.", "The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.", 'Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.', 'Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.', 'The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.', 'To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.', "A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.", 'China has 163 giant pandas in captivity.', 'There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s.'], 'headlines': ["China soon to complete country's first blood bank for pandas"], 'datelines': ['BEIJING, Jan 28']}, 
'XIN_ENG_20041019.0235': {'type': 'story', 'keywords': [], 'paragraphs': ['Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.', '"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, "\nsaid a zookeeper. "The inbreeding among giant pandas in the same\nareas easily leads to the species\' degeneration."', 'There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.', 'The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.', 'Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located.'], 'headlines': ['Beijing, Sichuan to exchange giant pandas'], 'datelines': ['BEIJING, Oct. 19 (Xinhua)']}}}]}

Assume the above dict is saved to topics_data. This code will write that data to a JSON file:
import json
with open('f.json', 'w') as outfile:
    json.dump(topics_data, outfile)

This will cause f.json to contain the following:
{"topics":[{"title":" Giant Panda ","id":"D1003A","category":"4","docset":{"XIN_ENG_20041019.0235":{"keywords":[],"headlines":["Beijing, Sichuan to exchange giant pandas"],"type":"story","datelines":["BEIJING, Oct. 19 (Xinhua)"],"paragraphs":["Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.","\"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, \"\nsaid a zookeeper. \"The inbreeding among giant pandas in the same\nareas easily leads to the species' degeneration.\"","There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.","The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.","Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located."]},"AFP_ENG_20050128.0218":{"keywords":[],"headlines":["China soon to complete country's first blood bank for pandas"],"type":"story","datelines":["BEIJING, Jan 28"],"paragraphs":["China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.","The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.","Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.","Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.","The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.","To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.","A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.","China has 163 giant pandas in captivity.","There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s."]}}}]}

Whitespace added, for readability (generated via https://jsonformatter.curiousconcept.com/):
{
   "topics":[
      {
         "title":" Giant Panda ",
         "id":"D1003A",
         "category":"4",
         "docset":{
            "XIN_ENG_20041019.0235":{
               "keywords":[
               ],
               "headlines":[
                  "Beijing, Sichuan to exchange giant pandas"
               ],
               "type":"story",
               "datelines":[
                  "BEIJING, Oct. 19 (Xinhua)"
               ],
               "paragraphs":[
                  "Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.",
                  "\"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, \"\nsaid a zookeeper. \"The inbreeding among giant pandas in the same\nareas easily leads to the species' degeneration.\"",
                  "There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.",
                  "The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.",
                  "Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located."
               ]
            },
            "AFP_ENG_20050128.0218":{
               "keywords":[
               ],
               "headlines":[
                  "China soon to complete country's first blood bank for pandas"
               ],
               "type":"story",
               "datelines":[
                  "BEIJING, Jan 28"
               ],
               "paragraphs":[
                  "China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.",
                  "The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.",
                  "Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.",
                  "Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.",
                  "The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.",
                  "To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.",
                  "A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.",
                  "China has 163 giant pandas in captivity.",
                  "There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s."
               ]
            }
         }
      }
   ]
}

Note that key order is not guaranteed: JSON objects are unordered by specification, and Python dicts only preserve insertion order from Python 3.7 onward. Do not rely on keys being in any particular order within a JSON object.
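Since key order should not be relied on, it can help to see which operations are order-independent. The following is a minimal sketch, not part of DocReader itself; the dicts `a` and `b` are hypothetical, trimmed-down stand-ins for one topic entry. It shows that dict equality ignores key order, and that passing `sort_keys=True` to `json.dumps` gives the same text regardless of insertion order.

```python
import json

# Two dicts with identical contents but different key insertion order
# (hypothetical stand-ins for one entry of topics_data['topics']).
a = {"title": " Giant Panda ", "category": "4", "id": "D1003A"}
b = {"id": "D1003A", "title": " Giant Panda ", "category": "4"}

# Dict equality compares keys and values only, never key order.
same_content = (a == b)

# sort_keys=True serializes keys in sorted order, so both dicts
# produce exactly the same JSON text.
sorted_a = json.dumps(a, sort_keys=True)
sorted_b = json.dumps(b, sort_keys=True)

print(same_content)
print(sorted_a == sorted_b)
```

Comparing parsed dicts, or comparing text produced with `sort_keys=True`, is therefore safer than comparing raw JSON strings.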
The following code reads f.json back in, so that topics_data has the same value as the first Python dict on this page:
import json
with open('f.json', 'r') as infile:
    topics_data = json.load(infile)
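To check the round trip and see how the nested structure is navigated, here is a small self-contained sketch. It assumes a heavily trimmed, hypothetical version of `topics_data` (one topic, one document, one paragraph) and writes to a temporary file instead of f.json so that nothing is overwritten; the names `path`, `loaded`, and `headlines` are illustrative only.

```python
import json
import os
import tempfile

# Hypothetical, trimmed-down stand-in for the topics_data dict on this page.
topics_data = {
    "topics": [
        {
            "title": " Giant Panda ",
            "category": "4",
            "id": "D1003A",
            "docset": {
                "AFP_ENG_20050128.0218": {
                    "type": "story",
                    "keywords": [],
                    "paragraphs": ["China has 163 giant pandas in captivity."],
                    "headlines": [
                        "China soon to complete country's first blood bank for pandas"
                    ],
                    "datelines": ["BEIJING, Jan 28"],
                }
            },
        }
    ]
}

# Write to a temporary location rather than f.json.
path = os.path.join(tempfile.mkdtemp(), "f.json")
with open(path, "w") as outfile:
    json.dump(topics_data, outfile)

# Read the file back into a new variable.
with open(path, "r") as infile:
    loaded = json.load(infile)

# The structure holds only dicts with string keys, lists, and strings,
# so the loaded value compares equal to the original.
round_trip_ok = (loaded == topics_data)

# Walk the structure: topic -> docset -> document -> fields.
headlines = []
for topic in loaded["topics"]:
    for doc_id, doc in topic["docset"].items():
        headlines.append((topic["id"], doc_id, doc["headlines"][0]))

print(round_trip_ok)
print(headlines)
```

The equality holds here because every value maps directly onto a JSON type. Data containing tuples or non-string dict keys would not survive unchanged, since JSON turns tuples into lists and keys into strings.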