OK, no promises. We'll try to add stuff to the API that may be missing. The simpler the request, the more likely it will be added. If you've got a request, reply here. Please be concise. Tell us which entity you want data added to, and what data. Also give us a sample use case if your request isn't 100% evident. Pseudocode is best. Too much English will end up being TL;DR, we all have ADHD here.
What's missing from the API?
The easiest addition to the API to get the CVS off of scraping the HTML is to add two new fields to the issue query:
- image_alternates - a list of urls to all alternate images attached to the issue (e.g. the 13 images all stacked up on the right side of this issue page)
- user_review_average - a float specifying the average review score for the issue (e.g. 4.25 on this issue page)
There's other user review info that could be cool to have access to, but as I understand it the CVS as currently built does not need it.
Hopefully someone else has looked at the plugin code and can confirm those are the two pieces of data we'd need. I've looked through most of the codebase and I've got a fairly good grasp of it, but I could easily have overlooked something.
Yes, @hyperspacerebel, you are correct.
If the API provided those two fields of data in the 'issue' entity, it would allow the '__issue_scrape_extra_details' function in cvdb.py to be removed, and that is the only place where Comic Vine Scraper scrapes HTML directly. Essentially, CVS could stop scraping HTML with no loss of functionality at all.
Also note that the 'user_review_average' is not nearly as important as the 'image_alternates'. 'image_alternates' is used for automatic cover matching and manually displaying additional covers. This is one of the centerpiece functionalities of the scraper--something that a LOT of people would really miss. The user_review_average is only for scraping one little, optional field and would not be missed nearly so much.
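To make the request concrete, here is a minimal sketch of how a client might read the two proposed fields if they were added to the 'issue' entity. The field names image_alternates and user_review_average are the ones proposed above and are not in the API today; the payload shape, the helper name, and the sample data are assumptions for illustration only.

```python
import json


def extract_extra_details(issue_json):
    """Return (alternate_cover_urls, review_average) from an 'issue' payload.

    Assumes the hypothetical fields proposed in this thread:
    'image_alternates' (list of URL strings) and 'user_review_average' (float).
    Missing or null fields fall back to an empty list / None.
    """
    results = json.loads(issue_json).get("results") or {}
    covers = [url for url in (results.get("image_alternates") or []) if url]
    average = results.get("user_review_average")
    return covers, (float(average) if average is not None else None)


# Made-up sample payload, not real API output.
sample = json.dumps({
    "results": {
        "image_alternates": ["https://example.com/a.jpg", "https://example.com/b.jpg"],
        "user_review_average": 4.25,
    }
})
covers, average = extract_extra_details(sample)
```

With something like this, the HTML-scraping step could in principle be replaced by a plain lookup on the issue response.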
Thinking across the board for the widest number of API uses, the main thing missing is a weekly-release-based resource. If you had a release_date resource, an application dedicated to accessing information about new comics would hardly ever have to use any other resource. Searching for volume and issue data would become unnecessary because you are starting from specific information.
What should be included in such a resource is open for debate, but consider it from the perspective of an application telling users which comics are on the shelves in any particular week and which info they would need to decide whether to buy one, and you can't go far wrong.
Obviously, for a mainly crowd-sourced database, the information on upcoming issues would be limited, but your database usually has pretty good data by the beginning of the week.
A great number of applications are mainly interested in current material, and anything that puts that info in the hands of developers in a paged list format would be a godsend.
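For comparison, this is roughly what a client has to do on its own today to get a "this week" list: work out the week and filter issue records by their in-store date. It is only a sketch. It assumes each issue record is a dict with a 'store_date' key in YYYY-MM-DD form (the key name is an assumption), that weeks run Monday to Sunday, and the sample records are made up.

```python
from datetime import date, timedelta


def week_bounds(day):
    """Return (monday, sunday) of the calendar week containing `day`."""
    monday = day - timedelta(days=day.weekday())
    return monday, monday + timedelta(days=6)


def released_in_week(issues, day):
    """Keep issues whose 'store_date' falls in the week containing `day`."""
    start, end = week_bounds(day)
    hits = []
    for issue in issues:
        raw = issue.get("store_date")  # assumed key name
        if not raw:
            continue  # the date may be missing on some records
        if start <= date.fromisoformat(raw) <= end:
            hits.append(issue)
    return hits


# Made-up sample records.
issues = [
    {"name": "A", "store_date": "2017-04-12"},
    {"name": "B", "store_date": "2017-04-19"},
    {"name": "C", "store_date": None},
]
this_week = released_in_week(issues, date(2017, 4, 10))
```

A dedicated release resource would move this filtering to the server and return the result as a paged list.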
There are two things I've noticed with the person resources:
The 1st issue is a bug. Query the resource of a person who is dead (Jack Kirby will do), and include the death date. If you're querying XML, the death element will be absent (for people who are still alive, it is present but empty). This only happens for XML; JSON works fine. See www.comicvine.com/api/person/4040-5614/?api_key=1234abcd&format=xml&field_list=birth,death,name for an example, after replacing the API key with a real one.
The 2nd is that while the HTML includes the Twitter handle of each person, this information is not available through the API.
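In case it helps with reproducing the bug, here is a small sketch that tells the three cases apart in an XML response: no death element at all, an empty one, or one with content. The response layout in the samples is an assumption and the XML snippets are made up, not copied from real API output.

```python
import xml.etree.ElementTree as ET


def death_state(xml_text):
    """Classify the <death> element of a person response.

    Returns 'absent' if there is no <death> element below the root,
    'empty' if it is present with no text and no children,
    and 'present' otherwise.
    """
    root = ET.fromstring(xml_text)
    death = root.find(".//death")
    if death is None:
        return "absent"
    has_text = bool((death.text or "").strip())
    if not has_text and len(death) == 0:
        return "empty"
    return "present"


# Made-up samples; the element layout is assumed.
no_element = "<response><results><name>Person A</name></results></response>"
empty_element = "<response><results><name>Person B</name><death/></results></response>"
filled_element = (
    "<response><results><name>Person C</name>"
    "<death><date>1900-01-01</date></death></results></response>"
)
states = [death_state(x) for x in (no_element, empty_element, filled_element)]
```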
1. ISBN would be so super cool!
2. Variant cover would be very cool.
3. Upcoming releases next week/this week would be nice.
The API is really powerful and I think a few additional things could make it even more awesome:
- Buy links (linking to Amazon / Comixology)
- Upcoming weekly releases (have to rely on in_store_date or added_to_api right now)
- Add 'mature / adult' tag to adult content. This makes it easier for us to sort / filter out this kind of content if we need to.
I would like a resolution-independent hash code with all images. That -- along with the hash algorithm used -- would allow one to identify issues just by their cover image. There are ~530,000 issues in your database. In order to avoid querying for such a field on these issues (bad idea!) I sure would like to download a single file of 20MB (or so) with the issue identifier and the image hash. That would make it extremely easy for software to tag comics correctly and automatically.
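To illustrate what such a hash could look like: the request above does not name an algorithm, so the sketch below swaps in an "average hash", one well-known kind of perceptual hash, purely as an example. It works on an already-decoded grayscale grid (a list of rows of 0-255 values) to stay self-contained; decoding real cover images would need an imaging library. Function names and the 8x8 size are my own choices.

```python
def average_hash(pixels, size=8):
    """Compute a size*size-bit average hash of a grayscale image.

    `pixels` is a list of rows of brightness values (0-255) and must be at
    least `size` pixels in each dimension. The image is shrunk to a
    size x size grid by averaging rectangular blocks; each cell then gives
    one bit: 1 if it is brighter than the mean of all cells, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for r in range(size):
        r0, r1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            c0, c1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a, b):
    """Number of differing bits between two hashes (0 means identical)."""
    return bin(a ^ b).count("1")


# A synthetic 16x16 image and the same image scaled up 2x.
small = [[(x * y) % 256 for x in range(16)] for y in range(16)]
large = [[small[y // 2][x // 2] for x in range(32)] for y in range(32)]
distance = hamming_distance(average_hash(small), average_hash(large))
```

Matching would then be a lookup for the stored hash with the smallest Hamming distance to the hash of the local cover, which is why a single downloadable file of identifier/hash pairs would be enough.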
I don't know if anyone is still monitoring this forum, but I really could use the ability to sort by start_year under volumes. It's crazy to me that I can't do that for certain series like "X-Men." There are currently 1026 results. If I could sort by start_year descending, I would easily be able to find "X-Men Blue", for example, which started this year, without knowing the exact name.
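Until something like that exists server-side, the workaround is to fetch every page and sort locally. A minimal sketch, assuming each volume record is a dict whose 'start_year' may be a string, a number, or missing; the sample records are made up.

```python
def newest_first(volumes):
    """Sort volume records by 'start_year', newest first; unknown years go last."""
    def year(volume):
        raw = volume.get("start_year")
        try:
            return int(raw)
        except (TypeError, ValueError):
            return -1  # missing or non-numeric year sorts to the end
    return sorted(volumes, key=year, reverse=True)


# Made-up sample records.
volumes = [
    {"name": "Old Series", "start_year": "1963"},
    {"name": "Unknown Series", "start_year": None},
    {"name": "New Series", "start_year": "2017"},
]
ordered = [v["name"] for v in newest_first(volumes)]
```

This works, but it means pulling all 1026 results first, which is exactly the cost a server-side sort would avoid.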
I'd like the API to follow REST conventions and allow more precise searches with LHS brackets.
For example if I search for the volume of name "Blast" I also get volumes of "StarBlast" that I don't want.
I also tried to filter my volumes by publisher, but had no luck with either the id or the name.
I was still very glad to find this free API for my personal project so thank you :)
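To show what I mean, here is a rough sketch of an LHS-bracket style query string, plus the client-side exact-match filter I use as a workaround in the meantime. The bracket operators ('eq', 'gte') and parameter names are hypothetical; nothing here is supported by the API today.

```python
from urllib.parse import urlencode


def lhs_bracket_query(**conditions):
    """Build a query string such as 'name[eq]=Blast&start_year[gte]=2017'.

    Keyword names use 'field__operator'; a bare field name means 'eq'.
    The operator names are hypothetical.
    """
    params = []
    for key, value in conditions.items():
        field, _, operator = key.partition("__")
        params.append((f"{field}[{operator or 'eq'}]", value))
    # Keep the brackets readable instead of percent-encoding them.
    return urlencode(params, safe="[]")


def exact_name(volumes, name):
    """Client-side workaround: keep only volumes whose name matches exactly."""
    wanted = name.casefold()
    return [v for v in volumes if (v.get("name") or "").casefold() == wanted]


query = lhs_bracket_query(name__eq="Blast", start_year__gte=2017)
matches = exact_name([{"name": "Blast"}, {"name": "StarBlast"}], "blast")
```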
It would be darned great if the characters query could sort or filter by count_of_issue_appearances. Similar for series and publishers. My goal here: I want to easily pull out a relatively small list of the most "popular" characters/publishers/series. There are many, many, many rows for these 3 queries, 99% of which are low-traffic properties; with a hard limit of 100 rows, even paging with the offset would be a nightmare to grab all rows and then filter post query.
I notice that the wiki result pages can sort by appearances. How about the API?
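For reference, this is the kind of client-side loop needed today: page through everything 100 rows at a time, then keep the top few by count_of_issue_appearances. It is a sketch only. fetch_page is a hypothetical callable standing in for the real HTTP request, and the fake data exists just so the example runs.

```python
import heapq


def iter_all(fetch_page, limit=100):
    """Yield every row by calling fetch_page(offset, limit) until a short page."""
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            return
        offset += limit


def most_popular(rows, n=10):
    """Return the n rows with the highest 'count_of_issue_appearances'."""
    return heapq.nlargest(
        n, rows, key=lambda row: row.get("count_of_issue_appearances") or 0
    )


# Fake data source standing in for the API (250 made-up rows).
data = [{"name": f"c{i}", "count_of_issue_appearances": i} for i in range(250)]


def fake_fetch(offset, limit):
    return data[offset:offset + limit]


top = most_popular(iter_all(fake_fetch), n=3)
```

With a server-side sort on count_of_issue_appearances, the whole loop would collapse to a single request for the first page.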