
My Google Reader alternative using Windows Azure Mobile Services. Part 3

Now that my blogging engine is back on track, it’s time for the long-awaited third part of my Google Reader alternative series. In this part I’ll show you how I implemented the server-side code that reads news for the subscriptions I stored in SQL Azure in part 2.

Since Windows Azure Mobile Services lets you schedule and run recurring jobs in your backend, I figured my Google Reader service could define a server script that runs on a schedule I define and downloads the RSS entries for each of my subscriptions. So I created a new scheduled job by clicking the Scheduler tab and then the +Create button at the bottom of the page.


In the scheduler dialog, I entered DownloadPosts as the Job Name, set the schedule interval and units, and then clicked the check button.

Once the job was created, I clicked the Script tab at the top to write the logic that fetches the RSS feeds.


The code I wrote was pretty simple: for each subscription stored in my Feeds table, I called the URL stored in the FeedUrl field of the Feed entry.

function DownloadPosts() {
    console.log("Running post import.");
    // Read every subscription from the feed table and import its posts.
    var feedsTable = tables.getTable("feed");
    feedsTable.read({
        success: function(results) {
            results.forEach(function(feed) {
                importPosts(feed);
            });
        }
    });
}

The importPosts function was responsible for making the actual HTTP call and storing the data in the posts table.

function importPosts(feed) {
    var req = require('request');
    console.log("Requesting item '%j'.", feed);
    req.get({
            url: ""
        },
        function(error, result, body) {
            //1. Parse body
            //2. Store in the database
        }
    );
}

Unfortunately, Windows Azure Mobile Services does not yet provide an XML node module that could parse the XML RSS feeds returned by the subscriptions (according to the team, they’re planning to add more modules in the near future). Only a JSON module is available, for consuming JSON feeds. So at this point I had two options: either move my post retrieval logic to the client, or build a proxy that converts the XML RSS feed to a JSON one. I went with the second option for two reasons:

  • If I moved the download logic to the client, it would have to be replicated in every client I used.
  • Parsing the XML feed server side should become possible in the near future, so there is no point in replicating the code. Once the XML node module is released, it will be relatively easy to drop the proxy and parse the feed directly in the scheduled job.
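
To illustrate the second point, here is a rough sketch of how the job could be kept independent of where the posts come from. This is not the code I ended up with: makeImporter, fakeProxyFetcher and storePost are hypothetical names used only for illustration, and the fetcher below returns made-up data instead of calling anything over HTTP.

```javascript
// Hypothetical sketch: the job only knows about fetchPosts(feed, callback),
// so a proxy-based fetcher can later be swapped for one that parses XML directly.
function makeImporter(fetchPosts, storePost) {
    return function (feed) {
        fetchPosts(feed, function (error, posts) {
            if (error) {
                return; // nothing to store for this feed
            }
            posts.forEach(storePost);
        });
    };
}

// Stand-in fetcher for illustration; a real one would call the proxy over HTTP.
function fakeProxyFetcher(feed, callback) {
    callback(null, [{ Link: "http://example.com/" + feed.id, Title: "Sample" }]);
}

var stored = [];
var importFeed = makeImporter(fakeProxyFetcher, function (post) {
    stored.push(post);
});
importFeed({ id: 7 });
console.log(stored.length); // 1
```

With this shape, replacing the proxy would mean passing a different fetcher to makeImporter and leaving the storing logic alone.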

So I built a simple Cloud Service containing a very basic web role that converts RSS feeds to JSON. The web role contains an HTTP handler that receives a feed ID as a query string argument. It looks up the ID in the Feed table (where my subscriptions are stored), finds the feed URL, and calls it to get the XML data. It then uses the Newtonsoft.Json serializer to serialize objects that hold the data from the parsed RSS feed. The code looks like this:

using System;
using System.Collections.Generic;
using System.Net;
using System.ServiceModel.Syndication;
using System.Web;
using System.Linq;
using System.Xml.Linq;
using CloudReader.Model;
using System.Diagnostics;

namespace XmlToJsonApi
{
    public class Convert : IHttpHandler
    {
        const string CONTENT_NAMESPACE = "http://purl.org/rss/1.0/modules/content/";
        const string MOBILESERVICE_ENDPOINT = "https://cloudreader.azure-mobile.net/tables/feed/{0}";

        public int FeedID
        {
            get 
            { 
                return GetFromQueryString("id", 0); 
            }
        }

        /// <summary>
        /// You will need to configure this handler in the Web.config file of your 
        /// web and register it with IIS before being able to use it. For more information
        /// see the following link: http://go.microsoft.com/?linkid=8101007
        /// </summary>
        #region IHttpHandler Members

        public bool IsReusable
        {
            // Return false in case your Managed Handler cannot be reused for another request.
            // Usually this would be false in case you have some state information preserved per request.
            get { return false; }
        }

        public void ProcessRequest(HttpContext context)
        {
            SyndicationFeed feed = GetFeed(FeedID);

            var datafeed = ParseFeed(feed);

            string data = Newtonsoft.Json.JsonConvert.SerializeObject(datafeed.Select(pe => new 
                {
                    Author = pe.Author,
                    CommentsCount = 0,
                    Description = pe.Description,
                    Content = pe.Content,
                    FeedId = pe.FeedId,
                    IsRead = pe.IsRead,
                    Link = pe.Link,
                    PubDate = pe.PubDate,
                    Stared = pe.Stared,
                    Title = pe.Title
                }), Newtonsoft.Json.Formatting.Indented);

            context.Response.ContentType = "application/json";
            context.Response.Write(data);
        }

        private List<Post> ParseFeed(SyndicationFeed feed)
        {
            List<Post> posts = new List<Post>();
            if (feed != null)
            {
                // Use the feed   
                foreach (SyndicationItem item in feed.Items)
                {
                    try
                    {
                        Post feedItem = new Post()
                        {
                            Author = string.Join(", ", item.Authors.Select(sp => sp.Name)),
                            CommentsCount = 0,
                            Description = item.Summary != null ? item.Summary.Text : "",
                            Content = GetContent(item),
                            FeedId = FeedID,
                            IsRead = false,
                            Link = item.Links.Select(l => l.Uri.AbsoluteUri).FirstOrDefault(),
                            PubDate = item.PublishDate.UtcDateTime,
                            Stared = false,
                            Title = item.Title.Text
                        };
                        posts.Add(feedItem);
                    }
                    catch (Exception exception)
                    {
                        HandleException(exception);
                    }
                }
            }
            return posts;
        }

        private string GetContent(SyndicationItem item)
        {
            string content = null;
            if (item.Content != null)
            {
                content = item.Content.ToString();
            }
            else if (item.ElementExtensions.Where(e => e.OuterNamespace == CONTENT_NAMESPACE && e.OuterName == "encoded").Count() > 0)
            {
                var elem = item.ElementExtensions.Where(e => e.OuterNamespace == CONTENT_NAMESPACE && e.OuterName == "encoded").FirstOrDefault();
                if (elem != null)
                {
                    content = elem.GetObject<string>();
                }
            }
            return content;
        }

        private SyndicationFeed GetFeed(int feedId)
        {
            try
            {
                using (WebClient client = new WebClient())
                {
                    var buffer = client.DownloadString(string.Format(MOBILESERVICE_ENDPOINT, feedId));
                    Feed feed = Newtonsoft.Json.JsonConvert.DeserializeObject<Feed>(buffer);
                    XDocument doc = XDocument.Load(feed.XmlUrl);
                    return SyndicationFeed.Load(doc.CreateReader());
                }
            }
            catch (Exception ex)
            {
                HandleException(ex);
            }
            return null;
        }

        private void HandleException(Exception ex)
        {
            Trace.WriteLine(ex.Message);
            Trace.WriteLine(ex.StackTrace);
        }

        public int GetFromQueryString(string requestParameter, int defaultValue)
        {
            string value = HttpContext.Current.Request.QueryString[requestParameter];

            if (!String.IsNullOrEmpty(value))
            {
                int iValue = -1;

                if (Int32.TryParse(value, out iValue))
                {
                    return iValue;
                }
            }

            return defaultValue;
        }
        #endregion
    }
}

As you can see, the handler uses the REST API of my Mobile Service to access the Feed. There’s no secret key involved and no Mobile Services managed API call. I’m just calling https://cloudreader.azure-mobile.net/tables/feed/{0}, passing the ID, and getting back a JSON-serialized Feed object. For this to work, I had to change the permissions on the Feed table to allow Read access to everyone.
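
For the scheduled job, what matters is the shape of the JSON the handler returns: an array of objects whose property names follow the anonymous type in ProcessRequest. Below is a minimal sketch of that shape with made-up values, plus a small check a consumer could run before touching the database. The sample data and the usablePosts helper are hypothetical, and the PubDate string format is an assumption about the serializer's default output.

```javascript
// Hypothetical sample of the proxy's response; the values are made up,
// the property names follow the anonymous type in ProcessRequest.
var sampleResponse = JSON.stringify([
    {
        Author: "Jane Doe",
        CommentsCount: 0,
        Description: "Short summary",
        Content: "<p>Full post body</p>",
        FeedId: 1,
        IsRead: false,
        Link: "http://example.com/posts/1",
        PubDate: "2013-03-01T10:00:00Z", // assumed date format
        Stared: false,
        Title: "First post"
    },
    { Title: "Entry without a link" }
]);

// Hypothetical helper: keep only entries the scheduled job can work with.
// The job looks posts up by Link, so a usable entry needs a non-empty Link.
function usablePosts(body) {
    var parsed;
    try {
        parsed = JSON.parse(body);
    } catch (e) {
        return []; // not JSON at all, e.g. an HTML error page
    }
    if (!Array.isArray(parsed)) {
        return [];
    }
    return parsed.filter(function (post) {
        return post !== null && typeof post === "object" &&
            typeof post.Link === "string" && post.Link.length > 0;
    });
}

console.log(usablePosts(sampleResponse).length); // 1
console.log(usablePosts("<html>error</html>").length); // 0
```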


Once I had deployed and tested my proxy cloud service, I went back to my Windows Azure Mobile Services scheduled job to complete it. The scheduled job now calls the proxy for each Feed to convert the RSS feed to JSON, and then inserts each post that is not already in the database. The final code looks like this:

function importPosts(feed) {
    var req = require('request');
    console.log("Requesting item '%j'.", feed);
    req.get({
            url: "http://cloudrssreader.cloudapp.net/xmltojson.ashx?id=" + feed.id,
            headers: {accept : 'application/json'}
        },
        function(error, result, body) {
            // Skip this feed if the proxy call failed; JSON.parse would throw on a missing body.
            if (error || !body) {
                console.error("Could not download feed '%j'.", feed);
                return;
            }
            var json = JSON.parse(body);
            var postsTable = tables.getTable("post");
            json.forEach(function(post) {
                // Insert the post only if no stored post has the same Link.
                postsTable.where({Link : post.Link}).read({success : function(existingPosts){
                    if (existingPosts.length === 0){
                        console.log("Inserting item '%j'.", post);
                        postsTable.insert(post);
                    }
                }});
            });
        }
    );
}
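
One thing to watch for: the existence check is asynchronous, so if the same link shows up twice in a single response, both reads could come back empty before either insert completes, and the post could end up stored twice. A possible guard, sketched here under that assumption, is to drop duplicate links from the response before querying the table. uniqueByLink is a hypothetical helper, not part of Mobile Services.

```javascript
// Hypothetical helper: keeps the first occurrence of each Link, preserving order.
function uniqueByLink(posts) {
    var seen = {};
    return posts.filter(function (post) {
        var key = "link:" + post.Link; // prefix avoids clashes with built-in property names
        if (seen[key]) {
            return false;
        }
        seen[key] = true;
        return true;
    });
}

var posts = [
    { Link: "http://example.com/a", Title: "A" },
    { Link: "http://example.com/b", Title: "B" },
    { Link: "http://example.com/a", Title: "A again" }
];

console.log(uniqueByLink(posts).map(function (p) { return p.Title; }));
// prints [ 'A', 'B' ]
```

In importPosts this would wrap the parsed array, so the loop would run over uniqueByLink(json) instead of json.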

Once I enabled my scheduled job, new posts were stored in my database every hour, ready to be presented in my clients.

Stay tuned for part 4 of this series, where I’ll show how I consumed my posts in the Windows 8 client I talked about in part 1.
