The back end will have to work by scraping, for now. If it's properly modular, we can replace it if and when a usable API comes along. We should use scraping for everything rather than a hybrid of scraping and the old API: a new API would use a different protocol from the old one, so we wouldn't gain much by using the old one in the meantime.
There is a port of Beautiful Soup to Java, which will make the parsing much easier.
Dealing with comments is a big question, and I think we can leave it until later iterations.
Important things it must be able to do:
- Log in, and store the cookies
- Get lists of: your friends, your subscription and access filters, accounts in your circle
- Read timelines of various kinds
- Post entries
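To make the "properly modular" idea concrete, here is a rough sketch of what the boundary between front end and back end might look like in a Python prototype. Every name in it (BackEnd, InMemoryBackEnd, the method names, the "kind" argument, the canned data) is made up for illustration; nothing here is an agreed design. The point is only that the front end talks to the abstract class, so a scraper can later be swapped for an API client without touching anything else.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class BackEnd(ABC):
    """What the front end may ask for, regardless of how it is fetched."""

    @abstractmethod
    def log_in(self, username: str, password: str) -> None:
        """Authenticate and store the session cookies."""

    @abstractmethod
    def get_friends(self) -> List[str]:
        """Return the user's friends."""

    @abstractmethod
    def get_filters(self) -> Dict[str, List[str]]:
        """Return subscription and access filters, keyed by kind."""

    @abstractmethod
    def get_circle(self) -> List[str]:
        """Return the accounts in the user's circle."""

    @abstractmethod
    def read_timeline(self, kind: str, skip: int = 0) -> List[str]:
        """Return one slice of a timeline, starting `skip` entries in."""

    @abstractmethod
    def post_entry(self, subject: str, body: str) -> None:
        """Post a new entry."""


class InMemoryBackEnd(BackEnd):
    """Hypothetical stand-in with canned data, for working offline."""

    def __init__(self) -> None:
        self.cookies: Dict[str, str] = {}
        self.entries: List[str] = []

    def log_in(self, username: str, password: str) -> None:
        # A real scraper would keep the cookies the site sends back.
        self.cookies = {"session": "fake-session-for-" + username}

    def get_friends(self) -> List[str]:
        return ["alice", "bob"]

    def get_filters(self) -> Dict[str, List[str]]:
        return {"subscription": [], "access": []}

    def get_circle(self) -> List[str]:
        return ["alice", "bob", "carol"]

    def read_timeline(self, kind: str, skip: int = 0) -> List[str]:
        # Newest entries come first; `kind` is ignored in this stub.
        return self.entries[skip:]

    def post_entry(self, subject: str, body: str) -> None:
        self.entries.insert(0, subject)
```

A scraping back end would be another subclass of BackEnd, and the stub above would let the front end be developed and tested without hitting the site at all.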
The timelines present a particularly interesting challenge. Unfortunately, RSS/Atom feeds aren't a general solution, because they're not available for /read pages. But we can request timelines in slices (?skip=...), and that will be enough.
We should never try to download all the entries of any timeline. Instead, when the user gets to the bottom of a timeline, if we don't know whether there are more entries, we should check then. (Like the way the Tumblr and Facebook clients work.)
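Here is roughly how the load-on-demand behaviour could look in a Python prototype. It is only a sketch: the Timeline class, slice_url, the example.org address and the assumption that an empty slice means "no more entries" are all mine, not anything the site guarantees. The fetching function is passed in, so the same logic works whether the slice comes from scraping or from something else.

```python
from typing import Callable, List
from urllib.parse import urlencode


def slice_url(base_url: str, skip: int) -> str:
    """Build the URL for the slice that starts `skip` entries in."""
    return base_url + "?" + urlencode({"skip": skip})


class Timeline:
    """Holds the entries fetched so far and fetches more only when asked."""

    def __init__(self, fetch_slice: Callable[[int], List[str]]) -> None:
        # fetch_slice(skip) returns the entries of one slice (hypothetical).
        self._fetch_slice = fetch_slice
        self.entries: List[str] = []
        self.exhausted = False

    def load_more(self) -> List[str]:
        """Call when the user reaches the bottom of the list."""
        if self.exhausted:
            return []
        new_entries = self._fetch_slice(len(self.entries))
        if not new_entries:
            # Assumption: an empty slice means we have reached the end.
            self.exhausted = True
        self.entries.extend(new_entries)
        return new_entries
```

The front end would call load_more() once when the timeline is first shown and again each time the user scrolls to the bottom, so we never fetch more than the user has actually asked to see.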
I've been toying with putting together a Python prototype of the back end. It would allow us to debug it more easily.
Thoughts?