Tag Archives: Data API

Getting Started with YouTube Java API

via Getting Started with YouTube Java API | Java Code Geeks.

In this tutorial I take a look at Google’s YouTube API, which lets you bring YouTube’s features into your own application. YouTube is one of the “killer” Internet applications, and its traffic makes up a huge portion of total Internet traffic.

Before we get started, make sure you have read the API Overview Guide. We will mainly deal with the Data API, which allows you to perform many of the operations available on the YouTube website (search for videos, retrieve standard feeds, see related content etc.).

The API is available in multiple programming languages and we will be using Java for this tutorial. Read the Java Developer’s Guide to get a first idea. Also bookmark the Google Data API JavaDoc page.

Let’s prepare the development environment. First, download the GData Java Client from the corresponding download section. I will be using version 1.41.2 for this tutorial. Note that there is also a version 2, but according to the site it is experimental and not compatible with version 1.
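As a small illustration of the "search for videos" operation mentioned above, here is a minimal sketch that only builds a search feed URL using the standard library. The base URL and the `q` and `max-results` parameter names are assumptions taken from the legacy GData-style video feed, not from the text above, and the class and method names are hypothetical. The GData client library would normally build and send such a request for you.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class YouTubeSearchUrl {

    // Assumed base URL of the legacy GData video search feed (not given in the tutorial text).
    static final String VIDEOS_FEED = "http://gdata.youtube.com/feeds/api/videos";

    // Builds a search URL for the given free-text query and result limit.
    // The parameter names "q" and "max-results" are assumptions for illustration.
    static String buildSearchUrl(String query, int maxResults) {
        try {
            // URL-encode the query so spaces and special characters are safe in a URL.
            String encoded = URLEncoder.encode(query, "UTF-8");
            return VIDEOS_FEED + "?q=" + encoded + "&max-results=" + maxResults;
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException("UTF-8 not supported", e);
        }
    }

    public static void main(String[] args) {
        // Only prints the URL; no network request is made.
        System.out.println(buildSearchUrl("java tutorial", 10));
    }
}
```

The sketch deliberately stops at constructing the URL, so it runs without the GData jars on the classpath.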

Goodbye MongoDB, Hello PostgreSQL

via Goodbye MongoDB, Hello PostgreSQL.

Olery was founded almost 5 years ago. What started out as a single product (Olery Reputation) developed by a Ruby development agency grew into a set of different products and many different applications as the years passed. Today we have not only Reputation as a product but also Olery Feedback, the Hotel Review Data API, and widgets that can be embedded on a website, with more products/services coming in the near future.

We’ve also grown considerably when it comes to the number of applications. Today we deploy over 25 different applications (all Ruby). Some of these are web applications (Rails or Sinatra), but most are background processing applications.

While we can be extremely proud of what we have achieved so far, there was always something lurking in the dark: our primary database. From the start of Olery we’ve had a database setup that involved MySQL for crucial data (users, contracts, etc.) and MongoDB for storing reviews and similar data (essentially the data we can easily retrieve in case of data loss). While this setup served us well initially, we began experiencing various problems as we grew, in particular with MongoDB. Some of these problems were due to the way applications interacted with the database; some were due to the database itself.

For example, at some point in time we had to remove about a million documents from MongoDB and then re-insert them later on. The result of this process was that the database went into a near-total lockdown for several hours, resulting in degraded performance. It wasn’t until we performed a database repair (using MongoDB’s repairDatabase command) that performance recovered. This repair itself also took hours to complete due to the size of the database.

In another instance we noticed degraded performance of our applications and managed to trace it to our MongoDB cluster. However, upon further inspection we were unable to find the actual cause of the problem. No matter what metrics we installed, tools we used or commands we ran, we couldn’t find the cause. It wasn’t until we replaced the primaries of the cluster that performance returned to normal.

These are just two examples; we’ve had numerous cases like this over time. The core problem here wasn’t just that our database was acting up, but also that whenever we’d look into it there was absolutely no indication as to what was causing the problem.