How to create social media posts from long-form content using Python
This article discusses generating social media posts from long-form written content using Python. Most companies publish technical blogs, white papers, and articles as part of their marketing initiatives, and they push this content to social media with a short summary to bring customers to their websites. Most of these articles and blog posts are repurposed in this way. This solution helps you create such social media posts from blog posts or articles. In this post you will learn how to implement the solution using Python and Flask and host it on Azure App Service. You will also learn how to use Azure Cognitive Services instead of the nltk package.
How it works
The algorithm is very simple. First, you parse the URL and extract the keywords from the content using NLP. Next, you find the sentences in the content that contain the most keywords and display them.
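The sentence-selection step can be sketched in plain Python as below. This is only an illustration of the idea, not the article's own code: the function name `pick_sentences`, the `min_hits` threshold, and the regex-based sentence splitter are assumptions made for the example (the article itself uses nltk for keyword extraction and sentence splitting).

```python
import re
from typing import Iterable, List


def pick_sentences(text: str, keywords: Iterable[str], min_hits: int = 2) -> List[str]:
    """Return sentences containing at least `min_hits` distinct keywords, best first."""
    wanted = {k.lower() for k in keywords}
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    scored = []
    for position, sentence in enumerate(sentences):
        words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
        hits = len(wanted & words)
        if hits >= min_hits:
            # Negative hit count so that sorting puts the richest sentences first;
            # position keeps the original order among ties.
            scored.append((-hits, position, sentence))
    return [sentence for _, _, sentence in sorted(scored)]
```

Given the keywords `python`, `flask`, and `apps`, a sentence that mentions all three ranks ahead of one that mentions only `flask`, and a sentence with no keywords is dropped.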
The following packages are used in this example.
|Package|Purpose|
|---|---|
|Flask|For user interface and user interactions|
|Newspaper|For getting the content from URLs or website|
|Nltk|For extracting keywords from text and splitting content into multiple sentences|
So you need to install the above packages. Here is the requirements.txt file.
You can run `pip install -r requirements.txt` in your virtual environment. Once all the requirements are installed, you can create the `app.py` file. You can find the `app.py` file in the implementation section. You can use VS Code for development, with the Docker and Azure extensions.
Implementation
Flask serves the user interface and handles user input. The newspaper package downloads the URL, converts the page into readable text, and extracts the keywords from the content with the help of the nltk package.
This implementation has one route that handles two HTTP methods. When a user browses to the URL, the HTTP GET method is invoked and the route returns the `index.html` file. When a user fills in the `UrlInput` field and submits the form, the HTTP POST method is invoked. In the backend, you read the value of the `UrlInput` form field. Using the newspaper package, the URL is downloaded and parsed, and NLP is run on the content with the help of `nltk` to extract the keywords. Next, `sent_tokenize` splits the text into sentences. Finally, based on the number of keywords in each sentence, sentences are added to an array, and the `result.html` file is rendered with that array. The app exposes port 5000. You can run and debug the application using VS Code.
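The route described above could look roughly like the sketch below. It is not the author's `app.py`. The helper name `sentences_with_keywords`, the `min_hits` threshold, and the `sentences` template variable are assumptions made for illustration, and the `index.html` and `result.html` templates are expected to exist in a `templates` folder. The newspaper and nltk imports sit inside the helper only so that the module can be imported before those packages are installed.

```python
import re

from flask import Flask, render_template, request

app = Flask(__name__)


def sentences_with_keywords(url, min_hits=2):
    """Download the article at `url` and return its keyword-rich sentences."""
    # Imported lazily (assumption made for this sketch, see the note above).
    from newspaper import Article
    from nltk.tokenize import sent_tokenize

    article = Article(url)
    article.download()
    article.parse()
    article.nlp()  # fills article.keywords; relies on nltk's punkt resource

    keywords = {k.lower() for k in article.keywords}
    picked = []
    for sentence in sent_tokenize(article.text):
        words = set(re.findall(r"[a-z0-9']+", sentence.lower()))
        # `min_hits` is a hypothetical threshold; tune it to taste.
        if len(keywords & words) >= min_hits:
            picked.append(sentence)
    return picked


@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        url = request.form.get("UrlInput", "").strip()
        if url:
            # `sentences` is an assumed name for the template variable.
            return render_template("result.html", sentences=sentences_with_keywords(url))
    return render_template("index.html")
```

To serve it on the port mentioned above, you could call `app.run(host="0.0.0.0", port=5000)` from a small launcher script, or let gunicorn load `app` inside the container.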
In the next section, you will publish the solution to Azure.
Publishing to Azure
To publish the solution to Azure, let’s convert the solution into a Docker image and publish it. To do this, you can use the VS Code Docker extension to add a Dockerfile. Once you add the Dockerfile, you will get a `requirements.txt` file with the `gunicorn` package. You need to add the packages you installed to this file. Modify the `requirements.txt` file like the following.
And here is the Dockerfile generated by VS Code.
Once that is done, build the Docker image with the command `docker image build --tag anuraj/postgenerator .`. You need to use your Docker Hub or container registry ID instead of `anuraj`. Once the image is built, run the container with the command `docker run -d -p 5000:5000 anuraj/postgenerator`, then open a browser and check whether the application is running by browsing to http://127.0.0.1:5000/. It will show the UI. When you submit a URL, however, it will throw an `Internal Server Error`. You can check the Docker logs, which will show something like this.
To fix this issue, you need to download the `punkt` resource. You can do it in the Dockerfile like this.
In the Dockerfile, you’re downloading the `punkt` resource to the `/app/nltk` directory and setting the `NLTK_DATA` environment variable to that directory. Now build the image and run it. It should work properly.

You have now built a Docker image. Next, you need to publish the image to a Docker registry. For this example, Docker Hub is used, and the image is tagged based on the Docker Hub account. If you did not follow that convention when building the image, you need to tag the image with your ID. If you’re using VS Code, you can push the image from there with the help of the Docker extension, or you can use the `docker push` command like this: `docker push anuraj/postgenerator`. It may take some time, depending on your internet bandwidth.
Once the push completes, you can check Docker Hub and verify that the image is available. To deploy the image to App Service, you can use the VS Code Docker extension: right-click the image tag and choose the `Deploy Image to Azure App Service` option.
It will prompt for some configuration values, similar to what you configure when creating an Azure App Service. Once that is done, VS Code will provision the App Service and deploy the container image to it.
You can extend the implementation using Azure Cognitive Services Text Analytics. Instead of using the `nltk` package to extract the keywords, you can use the Azure Text Analytics service. Here is the code for getting the keywords using Text Analytics, from docs.microsoft.com.
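As a rough sketch of that call, assuming the `azure-ai-textanalytics` client library (version 5.x, as listed in the references) is installed: the helper names `build_client` and `extract_keywords` are made up for this example, and the endpoint and key come from your own Text Analytics resource. The client is passed in as a parameter so the keyword logic can be tried with a stand-in object.

```python
def build_client(endpoint, key):
    """Create a Text Analytics client from a resource endpoint and key."""
    # Imported lazily so this module loads even without the Azure SDK installed.
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def extract_keywords(client, documents):
    """Return the key phrases found across `documents` (a list of strings),
    de-duplicated case-insensitively and kept in first-seen order."""
    seen = set()
    keywords = []
    for result in client.extract_key_phrases(documents=documents):
        if result.is_error:
            continue  # skip documents the service could not process
        for phrase in result.key_phrases:
            if phrase.lower() not in seen:
                seen.add(phrase.lower())
                keywords.append(phrase)
    return keywords
```

The returned list could then take the place of the keywords that nltk produced in the sentence-selection step.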
Please note that Text Analytics has request data limits: the maximum number of characters per document is 5120, and the maximum number of documents per request is 10. So if you plan to extract keywords from long documents, you may need to split the document and join the results.
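One way to stay within those limits is sketched below, under the assumption that counting characters with Python's `len` is close enough to how the service counts them. The names `split_document` and `batches` are hypothetical helpers, not part of any Azure library.

```python
MAX_CHARS = 5120  # per-document limit quoted above
MAX_DOCS = 10     # per-request limit quoted above


def split_document(text, max_chars=MAX_CHARS):
    """Split `text` into pieces of at most `max_chars`, breaking on whitespace."""
    chunks, current = [], ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def batches(items, size=MAX_DOCS):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

Each batch can be sent as one request, and the key phrases returned for every batch can be concatenated to get the keywords for the whole document.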
Now you have implemented a minimal Python AI application and deployed it to Azure. You can find more details about App Service deployment, configuring CI/CD pipelines for Python applications, using Azure Cognitive Services in Python, and so on in the Reference Links section.
Reference Links
- Tutorial: Deploy Docker containers to Azure App Service with Visual Studio Code
- Configure a Linux Python app for Azure App Service
- Quickstart: Create a Python app in Azure App Service on Linux
- Use CI/CD to deploy a Python web app to Azure App Service on Linux
- Build an AI web app by using Python and Flask
- Azure Text Analytics client library for Python - Version 5.0.0
Happy Programming :)