How To Generate Google Sitemap Using MongoDB and Python

In this post, I will walk through generating a sitemap from URLs stored in MongoDB. I am assuming you already have all of your URLs stored in a MongoDB collection.

I will use pymongo to query MongoDB. If you don't have it, install it using pip.

pip install pymongo

Let's open a connection to MongoDB using pymongo now.

from pymongo import MongoClient
client = MongoClient('mongodb://localhost:27017')
db = client['alternatedb'] 

I have my URLs stored as slugs in the following collection...

db.iosapp_slugs.find_one()
{'_id': ObjectId('5d518ca585468955cd403cb2'),
 'iosapp_id': 'id1016307224',
 'iosapp_slug': 'ios-app-Shark-Fishing-Extreme-Games-Free'}

Let's create the header of sitemap.xml first.

header = '''<?xml version="1.0" encoding="UTF-8"?> 
<urlset \
   xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" 
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
   xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 
      http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
'''

Let's loop through the collection to create sitemap.xml now...

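from xml.sax.saxutils import escape

with open('sitemap.xml', 'w') as fp:
  fp.write('%s\n' % header)
  for doc in db.iosapp_slugs.find({}):
    # Escape XML special characters (such as &) that a slug might contain
    slug = escape(doc['iosapp_slug'])
    fp.write('<url>\n')
    fp.write('<loc>https://www.example.com/%s/</loc>\n' % slug)
    fp.write('<priority>1.00</priority>\n')
    fp.write('</url>\n')
  # Close the root element opened in the header
  fp.write('</urlset>\n')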
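The sitemap protocol limits a single sitemap file to 50,000 URLs, so a large collection may need more than one file. Here is a minimal sketch of one way to split the slugs into chunks and render each chunk as its own sitemap. The names chunked, render_sitemap, SITEMAP_NS and MAX_URLS_PER_FILE are my own for illustration and are not part of pymongo or any sitemap library, and the shorter urlset header here is a simplification of the one above.

```python
from itertools import islice
from xml.sax.saxutils import escape

SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9'
MAX_URLS_PER_FILE = 50000  # per-file URL limit from the sitemap protocol


def chunked(iterable, size):
    """Yield lists of at most `size` items from any iterable (hypothetical helper)."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def render_sitemap(slugs, base_url='https://www.example.com'):
    """Return the XML text of one sitemap file for the given slugs."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?>\n',
             '<urlset xmlns="%s">\n' % SITEMAP_NS]
    for slug in slugs:
        parts.append('<url>\n')
        # escape() protects characters such as & and < inside the slug
        parts.append('<loc>%s/%s/</loc>\n' % (base_url, escape(slug)))
        parts.append('<priority>1.00</priority>\n')
        parts.append('</url>\n')
    parts.append('</urlset>\n')
    return ''.join(parts)
```

Each chunk from chunked(slugs, MAX_URLS_PER_FILE) could then be written to its own file, for example sitemap-1.xml, sitemap-2.xml and so on, and those files listed in a sitemap index file, which the same protocol defines.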

If you have a lot of URLs in MongoDB, you might run into the following error...

"errmsg": "Cursor not found, cursor id: ###",

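To fix that, add batch_size to the MongoDB query in the loop, as shown below. With a smaller batch, the driver goes back to the server more often, so the cursor is less likely to sit idle long enough to be closed.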

for doc in db.iosapp_slugs.find({}).batch_size(100):
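If you want to keep the cursor handling in one place, the batched query can be wrapped in a small generator. This is only a sketch: iter_slugs is a hypothetical helper of mine, and it assumes the collection object behaves like a pymongo collection, that is, find() returns a cursor with batch_size() and close(). It also asks MongoDB for the iosapp_slug field only, which keeps each batch small.

```python
def iter_slugs(collection, batch_size=100):
    """Yield slug strings from a pymongo-style collection in small batches."""
    # Project only the field we need; _id is excluded explicitly.
    cursor = collection.find({}, {'iosapp_slug': 1, '_id': 0}).batch_size(batch_size)
    try:
        for doc in cursor:
            yield doc['iosapp_slug']
    finally:
        # Release the server-side cursor even if the caller stops early.
        cursor.close()
```

With the connection from earlier in this post, the loop would become for slug in iter_slugs(db.iosapp_slugs): and the rest of the writing code stays the same.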

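If you are using Python, you may well be using Django. Here is how to serve sitemap.xml in Django. Put the generated sitemap.xml in one of your template directories, then add these imports to urls.py.

from django.urls import re_path
from django.views.generic import TemplateView

Then add the following to the urlpatterns list. On Django versions before 2.0, use url from django.conf.urls in place of re_path.

re_path(r'^sitemap\.xml/$', TemplateView.as_view(template_name="sitemap.xml", content_type='text/xml'))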
