Price Tracker Application with Django: Crawling Discounts From eBay

Written by thedevtimeline | Published 2019/12/13

TLDR In this tutorial we are going to build a price tracker application that notifies us of discounts. We will use RabbitMQ + Celery + BeautifulSoup + Django to create this app. We use Celery to handle time-consuming tasks by passing them to a queue to be executed in the background. Celery is a well-established choice for background task processing in the Python/Django ecosystem. You can clone this project from my GitHub repository: https://github.com/raszidzie/Price-Tracker-Application-Application

What's up Hackers!
In this tutorial we are going to build a price tracker application that notifies us of discounts.
Creating complex projects is the key to learning fast! If you have been following me for a while, you already know I like complex stuff, so we are going to use RabbitMQ + Celery + BeautifulSoup + Django to create this app.
Alright! Let's Start!
Celery is a well-established choice for background task processing in the Python/Django ecosystem. It has a simple and clear API, and it integrates beautifully with Django. So, we are using Celery to handle the time-consuming tasks by passing them to a queue to be executed in the background, which keeps the server ready to respond to new requests.
Celery requires a solution to send and receive messages; usually this comes in the form of a separate service called a message broker. We will configure Celery to use the RabbitMQ messaging system, as it provides robust, stable performance and works well with Celery.
We can install RabbitMQ from Ubuntu’s repositories with the following command:
sudo apt-get install rabbitmq-server
Then enable and start the RabbitMQ service:
sudo systemctl enable rabbitmq-server
sudo systemctl start rabbitmq-server
If you are on a Mac, see the guide: Install RabbitMQ on Mac

Next, create a new Django project named pricetracker and an app named tracker.
Install the following dependencies:
pip3 install beautifulsoup4 httplib2 Celery
Once the installation has completed, add the CELERY_BROKER_URL configuration to the settings.py file:
CELERY_BROKER_URL = 'amqp://localhost'
Then, create celery.py inside your project package (the pricetracker folder that contains settings.py).
celery.py
import os
from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'pricetracker.settings')

app = Celery('pricetracker')
app.config_from_object('django.conf:settings', namespace='CELERY')
app.autodiscover_tasks()
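We are setting the default Django settings module for the 'celery' program and loading task modules from all registered Django app configs. The namespace='CELERY' argument means Celery reads only the settings whose names start with CELERY_, such as CELERY_BROKER_URL.
Now import the Celery app inside the __init__.py of the same project package: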
from .celery import app as celery_app

__all__ = ['celery_app']
This makes sure our Celery app is loaded every time Django starts.
Now, let's create our model.
models.py
from django.db import models

class Item(models.Model):
    title = models.CharField(max_length=200)
    url = models.CharField(max_length=600)
    requested_price = models.IntegerField(default=0)
    # crawled prices are floats such as 12.99, so an integer field would truncate them
    last_price = models.FloatField(null=True, blank=True)
    discount_price = models.CharField(max_length=100, null=True, blank=True)
    date = models.DateTimeField(auto_now_add=True)

    def __str__(self):
        return self.title
We are going to crawl eBay. The user will enter the URL of a specific item and a requested price.
So, let's create a form for that:
forms.py
from django import forms

class AddNewItemForm(forms.Form):
    url = forms.CharField(max_length=600)
    requested_price = forms.IntegerField()
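Since the form accepts any string as the URL, it may help to reject links that do not point to eBay before crawling them. The following is a minimal sketch using only the standard library; the helper name is_ebay_url and the rule "host is ebay.com or a subdomain of it" are assumptions for illustration, not part of the original project.

```python
from urllib.parse import urlparse


def is_ebay_url(url):
    """Return True if url looks like an http(s) link to an ebay.com host.

    Hypothetical helper: the accepted host rule is an assumption.
    """
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return False
    host = (parsed.hostname or "").lower()
    # accept the bare domain and any subdomain, but not look-alikes
    return host == "ebay.com" or host.endswith(".ebay.com")
```

A check like this could be called from a clean_url method on the form, so that invalid links are reported as form errors instead of failing later in the crawler.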
We will use BeautifulSoup to crawl the price and title of the item at the given URL. After the data is crawled, we have to convert the price to a float and create a new object in the database.
views.py
from urllib.request import urlopen, Request
from bs4 import BeautifulSoup

def crawl_data(url):
    # User Agent is to prevent 403 Forbidden Error
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    html = urlopen(req).read()
    bs = BeautifulSoup(html, 'html.parser')

    title = bs.find('h1', id="itemTitle").get_text().replace("Details about", "").strip()
    price = bs.find('span', id="prcIsum").get_text()
    # drop the currency prefix and thousands separators, e.g. "US $1,299.00"
    clean_price = float(price.replace("US", "").replace("$", "").replace(",", "").strip())
    return {'title': title, 'last_price': clean_price}
strip() removes whitespace at the beginning and end of a string, and replace() swaps a specified substring for another. We use these methods to get a clean price and title for the item.
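Chained replace() calls work for prices like "US $12.99" but depend on the exact prefix. As an alternative, here is a small sketch that pulls the first number out of the text with a regular expression. The function name parse_price is hypothetical, and the sketch assumes US-style formatting (comma as thousands separator, dot as decimal point).

```python
import re


def parse_price(text):
    """Extract the first number from a price string such as 'US $1,299.00'.

    Returns a float, or None when no number is found.
    """
    # one digit, then any digits or commas, then an optional decimal part
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))
```

Returning None instead of raising lets the caller decide how to handle a page where the price could not be read.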
Once the data has been crawled successfully, it is time to create a new object in the database. We will call this function on form submission to crawl the title and price of the new item.
views.py
from django.http import HttpResponseRedirect
from django.shortcuts import render
from .models import Item
from .forms import AddNewItemForm

def tracker_view(request):
    items = Item.objects.order_by('-id')
    if request.method == 'POST':
        # bind the form only when data was actually submitted
        form = AddNewItemForm(request.POST)
        if form.is_valid():
            url = form.cleaned_data['url']
            requested_price = form.cleaned_data['requested_price']
            # crawling the data
            crawled_data = crawl_data(url)
            Item.objects.create(
                url=url,
                title=crawled_data['title'],
                requested_price=requested_price,
                last_price=crawled_data['last_price'],
                discount_price='No Discount Yet',
            )
            # redirect back to this page so a refresh does not resubmit the form
            return HttpResponseRedirect(request.path)
    else:
        # plain GET: show an empty, unbound form
        form = AddNewItemForm()
    context = {
        'items': items,
        'form': form,
    }
    return render(request, 'tracker.html', context)
Great! Now, we need to crawl the data for all objects continuously to be aware of discounts. If we did this inside a request without Celery, the request would block until every page had been crawled, and the connection could time out before the server replied.
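Even in the background, one slow or unreachable page can hold up the crawl of every other item. As a rough sketch of one way to guard against that, the download step can be given a timeout and made to return None on failure. The name fetch_html and the 10-second default are assumptions, not part of the original project.

```python
from urllib.request import Request, urlopen


def fetch_html(url, timeout=10):
    """Download a page, returning its bytes or None if the request fails.

    URLError, HTTPError and socket timeouts are all OSError subclasses,
    so a single except clause covers them.
    """
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urlopen(req, timeout=timeout) as response:
            return response.read()
    except OSError:
        return None
```

A caller would then skip the item when the result is None and try again on the next run.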
Create tasks.py in your app and let's handle it with celery tasks.
import time
from celery import shared_task
from .models import Item
from tracker.views import crawl_data

# crawl every tracked item and flag the ones that dropped below the requested price
@shared_task
def track_for_discount():
    items = Item.objects.all()
    for item in items:
        # crawl item url
        data = crawl_data(item.url)
        # check for discount
        if data['last_price'] < item.requested_price:
            print(f'Discount for {data["title"]}')
            # update discount field to notify user
            item_discount = Item.objects.get(id=item.id)
            item_discount.discount_price = f'DISCOUNT! The price is {data["last_price"]}'
            item_discount.save()      
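# A module-level `while True` loop would block the import of tasks.py and
# call the function directly instead of queuing it as a Celery task.
# One option is to let Celery beat queue it every 15 seconds, e.g. in settings.py:
#
# CELERY_BEAT_SCHEDULE = {
#     'track-for-discount': {
#         'task': 'tracker.tasks.track_for_discount',
#         'schedule': 15.0,
#     },
# }
#
# and to start a worker with an embedded beat scheduler:
#   celery -A pricetracker worker -B -l info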
@shared_task lets us define a task without importing a concrete Celery app instance; the task is bound to whichever app is current, which makes it reusable. This makes the @shared_task decorator useful for libraries and reusable apps, since they do not have access to the user's app.
We are simply crawling the data every 15 seconds and comparing the last price with the requested price. If the last price is lower than the requested price, we update the discount_price field.
What if the price increases again?
@shared_task
def track_for_not_discount():
    items = Item.objects.all()
    for item in items:
        data = crawl_data(item.url)
        if data["last_price"] > item.requested_price:
            print(f'Discount finished for {data["title"]}')
            item_discount_finished = Item.objects.get(id=item.id)
            item_discount_finished.discount_price = 'No Discount Yet'
            item_discount_finished.save()
Great! Now it is possible to track discounts properly. You could add one more function that detects prices close to the requested price and notifies the user about them, for instance when the item price is $100 and the requested price is $97. But let's keep it simple for now.
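For readers who want to try the "closer prices" idea, here is a minimal sketch of the comparison as a plain function, independent of Django and Celery. The name classify_price, the three labels, and the 5% default tolerance are all assumptions made for illustration.

```python
def classify_price(last_price, requested_price, tolerance=0.05):
    """Classify a crawled price against the price the user asked for.

    Returns 'discount' when the price is below the requested price,
    'close' when it is not below but within `tolerance` (a fraction of
    the requested price) above it, and 'none' otherwise.
    """
    if last_price < requested_price:
        return "discount"
    if last_price <= requested_price * (1 + tolerance):
        return "close"
    return "none"
```

With the example from the text, classify_price(100, 97) falls in the 'close' band because 100 is within 5% of 97. A single task could call this once per item and update discount_price based on the label.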
Finally, we can create our template.
tracker.html
{% extends 'base.html' %}

{% block content %}
<form method="POST">
    {% csrf_token %}
    {{form.as_p}}
    <button class="btn btn-primary" type="submit">Send</button>
</form>
<table class="table">
    <thead>
      <tr>
        <th scope="col">Title</th>
        <th scope="col">Requested Price</th>
        <th scope="col">Last Price</th>
        <th scope="col">Discount Price</th>
        <th scope="col">Date Created</th>
      </tr>
    </thead>
    <tbody>
    {% for item in items %}
      <tr>
        <td>{{item.title}}</td>
        <td>{{item.requested_price}}</td>
        <td>{{item.last_price}}</td>
        <td>{{item.discount_price}}</td>
        <td>{{item.date}}</td>
      </tr>
    {% endfor %}  
    </tbody>
  </table>
  {% endblock %}  
You can improve the project by adding email functionality so that Django sends an email about discounts. Take a look at: How to Send Email in a Django App
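As a starting point for that improvement, here is a small sketch that only builds the text of a notification. The helper name build_discount_email and the wording of the message are hypothetical; actually sending the mail is left to Django.

```python
def build_discount_email(title, last_price, requested_price):
    """Return (subject, body) for a discount notification.

    Hypothetical helper; the wording of the message is an assumption.
    """
    subject = f"Discount: {title}"
    body = (
        f"{title} now costs {last_price:.2f}, "
        f"below your requested price of {requested_price:.2f}."
    )
    return subject, body


# With Django's email settings configured, the result could be passed to
# django.core.mail.send_mail(subject, body, from_email, recipient_list).
```

Keeping the message construction separate from the sending step makes it easy to test without a mail server.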
You can clone this project from my GitHub repository, linked at the top of this post.
Mission Accomplished!
That's it! Make sure you are following me on social media, and see you in the next post soon, Hackers!
Stay Connected!

Published by HackerNoon on 2019/12/13