doc_reader example

Austin Almond edited this page Apr 20, 2018 · 5 revisions

Example output of DocReader.read_docs:

Python dict

topics_data = {'topics': [{'title': ' Giant Panda ', 'category': '4', 'id': 'D1003A', 'docset': {'AFP_ENG_20050128.0218': {'type': 'story', 'keywords': [], 'paragraphs': ["China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.", "The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.", 'Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.', 'Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.', 'The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.', 'To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.', "A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.", 'China has 163 giant pandas in captivity.', 'There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s.'], 'headlines': ["China soon to complete country's first blood bank for pandas"], 'datelines': ['BEIJING, Jan 28']}, 
'XIN_ENG_20041019.0235': {'type': 'story', 'keywords': [], 'paragraphs': ['Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.', '"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, "\nsaid a zookeeper. "The inbreeding among giant pandas in the same\nareas easily leads to the species\' degeneration."', 'There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.', 'The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.', 'Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located.'], 'headlines': ['Beijing, Sichuan to exchange giant pandas'], 'datelines': ['BEIJING, Oct. 19 (Xinhua)']}}}]}
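
The nesting above (a list of topics, each holding a docset keyed by document ID) is easier to see when it is traversed. The following is a minimal sketch, not part of DocReader itself: `sample` is an abbreviated copy of the dict above with one document and one paragraph, and `summarize` is a hypothetical helper name introduced here for illustration.

```python
# Abbreviated sample with the same shape as the dict above (one document,
# one paragraph); the values are copied from the example output.
sample = {
    'topics': [{
        'title': ' Giant Panda ',
        'category': '4',
        'id': 'D1003A',
        'docset': {
            'AFP_ENG_20050128.0218': {
                'type': 'story',
                'keywords': [],
                'paragraphs': ['China has 163 giant pandas in captivity.'],
                'headlines': ["China soon to complete country's first blood bank for pandas"],
                'datelines': ['BEIJING, Jan 28'],
            },
        },
    }],
}


def summarize(topics_data):
    """Hypothetical helper: one row per document in every topic's docset."""
    rows = []
    for topic in topics_data['topics']:
        title = topic['title'].strip()  # titles carry surrounding spaces
        for doc_id, doc in topic['docset'].items():
            # 'headlines' is a list; fall back to '' if it is empty
            headline = doc['headlines'][0] if doc['headlines'] else ''
            rows.append((topic['id'], title, doc_id, headline, len(doc['paragraphs'])))
    return rows


rows = summarize(sample)
for row in rows:
    print(row)
```

Note that every field inside a document except 'type' is a list, even when it holds a single item, so code that wants "the" headline has to index into it.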

Python Dict to JSON string

Assume the above dict is assigned to topics_data. The following code writes it to a JSON file:

import json

# Serialize topics_data to f.json (overwrites the file if it already exists)
with open('f.json', 'w') as outfile:
    json.dump(topics_data, outfile)

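After this runs, f.json contains the following JSON on a single line (shown here without the spaces that json.dump inserts after ':' and ',' by default):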

{"topics":[{"title":" Giant Panda ","id":"D1003A","category":"4","docset":{"XIN_ENG_20041019.0235":{"keywords":[],"headlines":["Beijing, Sichuan to exchange giant pandas"],"type":"story","datelines":["BEIJING, Oct. 19 (Xinhua)"],"paragraphs":["Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.","\"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, \"\nsaid a zookeeper. \"The inbreeding among giant pandas in the same\nareas easily leads to the species' degeneration.\"","There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.","The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.","Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located."]},"AFP_ENG_20050128.0218":{"keywords":[],"headlines":["China soon to complete country's first blood bank for pandas"],"type":"story","datelines":["BEIJING, Jan 28"],"paragraphs":["China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.","The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.","Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.","Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.","The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.","To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.","A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.","China has 163 giant pandas in captivity.","There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s."]}}}]}

As JSON with whitespace added

Whitespace added, for readability (generated via https://jsonformatter.curiousconcept.com/):

{
  "topics":[
    {
      "title":" Giant Panda ",
      "id":"D1003A",
      "category":"4",
      "docset":{
        "XIN_ENG_20041019.0235":{
          "keywords":[

          ],
          "headlines":[
            "Beijing, Sichuan to exchange giant pandas"
          ],
          "type":"story",
          "datelines":[
            "BEIJING, Oct. 19 (Xinhua)"
          ],
          "paragraphs":[
            "Three giant pandas living in Beijing Zoo will be sent to Wolong of\nSichuan Province, southwest China, Wednesday and three others in\nSichuan will fly to Beijing Friday, under an exchange program aimed\nto maintain the biodiversity of the giant panda population.",
            "\"It is impossible for giant pandas fed in captivity to survive\nthrough natural selection, which will result in similar heredities, \"\nsaid a zookeeper. \"The inbreeding among giant pandas in the same\nareas easily leads to the species' degeneration.\"",
            "There are 11 giant pandas in Beijing Zoo and only one was caught in\nthe wild, with the other ten artificially bred.",
            "The three Beijing giant pandas about to leave were born respectively\nin 1991, 1999 and 2003, and one of the three Sichuan pandas was also\nborn last year.",
            "Wolong is a famous giant panda habitat where the world-known China\nConservation and Research Center of the Giant Panda is located."
          ]
        },
        "AFP_ENG_20050128.0218":{
          "keywords":[

          ],
          "headlines":[
            "China soon to complete country's first blood bank for pandas"
          ],
          "type":"story",
          "datelines":[
            "BEIJING, Jan 28"
          ],
          "paragraphs":[
            "China will soon finish building its \nfirst blood bank for pandas, which will assist researchers in \nstudying the endangered animals' blood types and chances of \naccepting blood transfusions, state media said Friday.",
            "The bank at southwest China's Giant Panda Protection and \nResearch Centre in the Wolong Nature Reserve in Sichuan province \nwill be completed this year, the China Daily said.",
            "Located in the giant panda breeding lab, the bank will help \nresearchers answer questions such as how many blood types pandas \nhave and whether they reject blood transfusions, centre sources \nsaid.",
            "Initial studies have found that pandas have different blood \ntypes, but researchers have not conducted in-depth studies and lack \nsufficient knowledge about this, centre deputy chief engineer Huang \nYan said.",
            "The centre will hold a general survey of the blood types of all \nthe 81 pandas being kept there, and collect and store their blood to \nbetter prepare for the protection and rescue of pandas in the wild.",
            "To boost their blood, researchers currently give pandas \ninjections of glucose and medicine, but blood transfusions are more \neffective in helping the animals improve their immunity and \naccelerate the process of recovery.",
            "A blood bank can also simplify the process of breeding pandas \nbecause data on the pandas' blood types and DNA information will be \nreadily available to bring more diversity to the gene pool.",
            "China has 163 giant pandas in captivity.",
            "There are only about 1,590 of the endangered species living in \nthe wild, all in China, though the numbers have risen steadily in \nthe past decade after plunging to around 1,100 in the 1980s."
          ]
        }
      }
    }
  ]
}
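
Indented output of this kind can also be produced from Python directly, without an external formatter. The sketch below uses the standard json module's indent parameter; `data` is a cut-down stand-in for topics_data (empty docset), used here only to keep the example short. The layout differs slightly from the formatter output above (for example, Python writes a space after each colon and prints empty lists as `[]`).

```python
import json

# Cut-down stand-in for topics_data, used only to keep the example short
data = {'topics': [{'title': ' Giant Panda ', 'id': 'D1003A',
                    'category': '4', 'docset': {}}]}

# indent=2 puts each key and list item on its own line, nested by 2 spaces
pretty = json.dumps(data, indent=2)
print(pretty)
```

To write the indented form straight to a file, the same parameter works with json.dump, as in `json.dump(topics_data, outfile, indent=2)`.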

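Note that key order may differ between the Python dict and the JSON output, as it does in the examples above. JSON objects are unordered by definition, and Python dicts only guarantee insertion order from Python 3.7 onward, so do not rely on keys appearing in any particular order within a JSON object.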
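
When a stable order matters (for example, when diffing two output files), one option is to sort keys at serialization time. This is a small sketch using the json module's sort_keys parameter; the two dicts `a` and `b` are illustrative and hold the topic-level fields from the example in different orders.

```python
import json

# Same fields, different key order
a = {'title': ' Giant Panda ', 'category': '4', 'id': 'D1003A'}
b = {'id': 'D1003A', 'title': ' Giant Panda ', 'category': '4'}

# Dict equality compares keys and values, not their order
same_dict = (a == b)

# sort_keys=True emits keys alphabetically, so both serialize identically
canon_a = json.dumps(a, sort_keys=True)
canon_b = json.dumps(b, sort_keys=True)
print(same_dict, canon_a == canon_b)
```

Comparing the parsed dicts is usually the simpler check; sorted output is mainly useful when the JSON text itself is compared or stored under version control.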

JSON string to Python dict

The code below reads f.json back in. Afterwards, topics_data has the same value as the first Python dict on this page.

import json

# Parse f.json back into a Python dict
with open('f.json', 'r') as infile:
    topics_data = json.load(infile)
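
The claim that the loaded value equals the original dict can be checked with a round trip. The sketch below writes an abbreviated dict (one document, one paragraph, copied from the example) to a temporary directory rather than the working directory; the use of tempfile and the variable names are choices made for this illustration only.

```python
import json
import os
import tempfile

# Abbreviated version of the dict at the top of this page
original = {'topics': [{
    'title': ' Giant Panda ', 'category': '4', 'id': 'D1003A',
    'docset': {'AFP_ENG_20050128.0218': {
        'type': 'story',
        'keywords': [],
        'paragraphs': ['China has 163 giant pandas in captivity.'],
        'headlines': ["China soon to complete country's first blood bank for pandas"],
        'datelines': ['BEIJING, Jan 28'],
    }},
}]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'f.json')
    # Write the dict out, then read it straight back
    with open(path, 'w', encoding='utf-8') as outfile:
        json.dump(original, outfile)
    with open(path, 'r', encoding='utf-8') as infile:
        restored = json.load(infile)

round_trip_ok = (restored == original)
print(round_trip_ok)
```

The round trip is lossless here because the structure contains only dicts with string keys, lists, and strings. Data containing tuples or non-string keys would come back as lists and string keys respectively.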
