API Rate Limiting

rick:

Many of you have noticed that our API rate limiting is stifling, to put it mildly. We heard you, and we have, yet again, changed the way we limit API use. We're sure you'll like this one...

Previously:

There was a limit of 450 requests within a 15-minute window. If you went above that, you were temporarily blocked. You could make all 450 of those requests in anywhere from 1 second to the full 15 minutes.

Now:

TL;DR: Space out your requests so AT LEAST one second passes between each and you can make requests all day. Go even a millisecond faster and you'll hit a brick wall REALLY HARD.

There is no limit on the total number of requests; you are limited in how often you can make them. There are no hard numbers here. It's a throttling algorithm that restricts aggressive apps and rewards well-behaved ones. If your app spaces its requests to at most one per second, you will not have any problems and can make requests 24/7. If the time between requests is less than 1 second, you will be restricted; the more of these requests you make, the more likely you are to be blocked, and the number of requests you are allowed will drop dramatically.
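For illustration, a minimal client-side throttle in the spirit of the rule above (a sketch in Python; the endpoint, key, and helper name are placeholders, not part of the API):

```python
import time
import requests

MIN_INTERVAL = 1.0          # at least one second between requests
API_KEY = "YOUR_API_KEY"    # placeholder

_last_call = 0.0

def throttled_get(url, **params):
    """GET url, sleeping first so consecutive calls are >= MIN_INTERVAL apart."""
    global _last_call
    wait = MIN_INTERVAL - (time.monotonic() - _last_call)
    if wait > 0:
        time.sleep(wait)
    _last_call = time.monotonic()
    params.update(api_key=API_KEY, format="json")
    return requests.get(url, params=params)

# usage (illustrative endpoint):
# throttled_get("http://www.comicvine.com/api/search/", query="batman")
```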


#2  Edited By arkay74

Yeah, that won't work too well at first, since the ComicVine Scraper makes more than one request per file as far as I know, and ComicRack only offers a way to set a delay between files (even 5 seconds seems to be too low). So code modifications are probably necessary before you see better behavior on your end.

arkay74:

Even with a 15-second delay I am stuck on my last 10 files and keep getting the limit error message.

It wouldn't have hurt to let the people using your API know beforehand that this was coming. As a software developer I find this very, very odd.

clearmist:

I second your opinion, arkay74. However, I sympathize with how little time and how few resources CBS gives its developers to work on comicvine.com. In that situation I understand how edgework would jump into the latest task and try to complete it quickly without much thought for the proper way of doing things.

arkay, you are completely right about the ComicRack scraper making multiple connections within one second. First comes the search, then a scrape for the currently selected cover, and third a scrape for the issue metadata (if you select the issue within one second of the initial search). cbanack said he's abandoned the project. I don't know whether he'll do the required work and make his plugin wait one second between all connection requests.
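One way a plugin like that could be fixed, sketched here with hypothetical function names: route every call site (search, cover, metadata) through a single shared gate so the one-second spacing holds no matter which code path fires.

```python
import threading
import time

def make_gate(min_interval=1.0):
    """Build a decorator; every function wrapped by the SAME gate shares one
    lock and one timestamp, so spacing holds across all call sites."""
    lock = threading.Lock()
    last = [0.0]

    def gate(fn):
        def gated(*args, **kwargs):
            with lock:
                wait = min_interval - (time.monotonic() - last[0])
                if wait > 0:
                    time.sleep(wait)
                last[0] = time.monotonic()
            return fn(*args, **kwargs)
        return gated
    return gate

api_gate = make_gate(1.0)   # one gate for the whole plugin

@api_gate
def search_volumes(query):          # hypothetical entry point
    ...

@api_gate
def fetch_issue_details(issue_id):  # hypothetical entry point
    ...
```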

rick:

@arkay74: Yes, and that's been a problem. Comic Vine's database usage is 3-5x that of GameSpot, and GameSpot has 10x the users. The reason for this is the scrapers. They are affecting the site quite negatively. The current algorithm will allow people to run these scrapers without affecting the rest of the site. We've got to find a balance here...

Marv74:

Will SCRAPE_DELAY=15 solve this? I've got over 60,000 comics to scrape. I tried to let it all run on auto during the night, only to find the computer had halted at about 200 comics :/

Hyperspacerebel:

When you say you'll hit a brick wall, does that mean that if a request is made < 1 second after the last, you'll return some sort of access-denied response, or that the response will be delayed a second or two to keep clients in line? The latter would be wonderful, would work well with existing tools, and would save you guys the hits. The former, however (which is what I'm assuming is the case), still leaves users unable to do much until the tools are updated. And then, when you update your policy again in 6 months, all the apps will have to be updated again. Whereas if it's on your end, you just set the delay to whatever you want, and everyone continues to automatically follow the rules.

Marv74:

SCRAPE_DELAY=8 seems to be okay. I was using 7, as instructed somewhere here on the forum, but that no longer works.

rick:

@hyperspacerebel: Think of this in terms of general relativity. The faster you go, the more mass you'll have, and the Higgs field is going to drag harder on you. The algorithm is based on exactly that. You will not be allowed to go past the cosmic speed limit. (Sorry, I won't say what that is, so as not to feed into gaming the system.) Suffice it to say: wait >= 1 second in between requests and you will never have a problem.


#10  Edited By Hyperspacerebel

That is not what I'm asking. In any way, shape, or form.

Look, 99.99% of the people hitting your API are using a 3rd-party program and have no control over whether the application follows your stated but not-programmatically-enforced rules. What I'm asking is whether I, or any other user of these applications, is going to get banned for using the programs without knowing whether they are hitting the API too many times or not. If your web server has a basic request limiter (which any competent nginx or Apache person could set up in literally 2 minutes) that automatically makes sure all requests from an IP address stay within the 1-second rule, then everything is fine and dandy: all of us using 3rd-party scripts can rest easy knowing whatever limits you impose will just happen, regardless of how the app is programmed to work. If, on the other hand, you're simply tracking requests and summarily banning/restricting people who break the 1-second rule, then that's a bit of a problem for 99.99% of the people hitting your API, because they have NO IDEA whether they are breaking any rules (the code of the app is often gibberish to the person using it), and they won't be able to control, or even know to control, their usage.

I really want to stress that literally 99.99%+ of the people hitting the API are not programmers and don't know whether the app they are using breaks arbitrary rules. It's better for all parties involved if your system can proactively rate-limit things to whatever rule you want (1 r/s in this case) rather than relying on the script writers to do it, who may or may not follow that rule and will otherwise just pass the problem, and the penalties, on to the users. If that is the case already, great.

Again, I'm going to reiterate it, because I keep getting the impression that you don't understand it: when you instruct us that "Suffice to say >= 1 second wait in between requests and you will never have a problem", that is useless advice for 99.99% of the people hitting your API. They went to a website, downloaded a program, and use it to tag their comics. They have no control over the way the program was written to hit the API, and most of them wouldn't know how to fix it if it were written in a way that was hurting you. So, what systems are in place so that a normal user doesn't get banned/restricted? How can I, as a regular user of tagging programs, ensure that I am following your rules, keeping you happy, and not going to be punished?
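For comparison, here is roughly what the requested server-side enforcement looks like: a per-IP token bucket, sketched as hypothetical WSGI middleware in Python (nginx's limit_req module does the same job in a line or two of config):

```python
import time
from collections import defaultdict

class RateLimitMiddleware:
    """WSGI middleware: allow roughly one request per second per client IP.
    A small burst allowance drains the bucket; sustained overspeed gets a 429."""

    def __init__(self, app, rate=1.0, burst=5):
        self.app = app
        self.rate = rate    # tokens refilled per second
        self.burst = burst  # bucket capacity
        self.buckets = defaultdict(lambda: (float(burst), time.monotonic()))

    def __call__(self, environ, start_response):
        ip = environ.get("REMOTE_ADDR", "unknown")
        tokens, stamp = self.buckets[ip]
        now = time.monotonic()
        tokens = min(self.burst, tokens + (now - stamp) * self.rate)
        if tokens < 1.0:
            self.buckets[ip] = (tokens, now)
            start_response("429 Too Many Requests",
                           [("Retry-After", "1"), ("Content-Type", "text/plain")])
            return [b"slow down, cowboy"]
        self.buckets[ip] = (tokens - 1.0, now)
        return self.app(environ, start_response)
```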

JohnKFisher:

Hyperspacerebel is 100% correct. Is there not a way to rate-limit on your side so that everything just works for everyone, INCLUDING ComicVine?


#12  Edited By cbanack
@edgework said:

Suffice to say >= 1 second wait in between requests and you will never have a problem.

This does not appear to be working as you describe. I have adjusted my app to ensure that it NEVER talks to api.comicvine.com more than once every 2000 ms, and after a short while I still get blocked with the 'slow down cowboy' error message (i.e., the 'API accessed too often' problem). The only difference is that now I am not blocked for very long; if I wait a minute and try again, I am able to access the API again. But then, a few minutes later, I am blocked again.

Several other contributors to the project have independently tried to make the same change that I made, and have had similar results (i.e. it doesn't work).

(And yes, I did search the code very carefully to make sure there isn't an API call I'm missing somewhere.)


#13  Edited By cbanack

FWIW, I also agree with the other commenters in this thread; evening out the load on your server(s) is your job, not mine. Using an arbitrary timeout that you expect API users to figure out and follow (on penalty of being beaten with the ban-hammer) is a very atypical way to offer a web API.

This isn't a matter of 'badly behaved' and 'well behaved' applications. When a typical software developer tries to use a web API conscientiously, he or she worries about the volume of requests being generated, not their timing. It is generally assumed that the server will queue up requests as necessary if too many happen to come in at (nearly) the same time.

clearmist:

I second @hyperspacerebel's post, including his implication that @edgework is simply not reading or understanding the posts in this thread. Just look at post #5: edgeybaby replied to @arkay74's first reply, but not his second.

There are two long-term solutions: inserting a connection delay at the web server (Apache) level, or pestering Chris over at comicbookdb.com to finally write an API.

arkay74:

@clearmist: That's how I would have implemented it as well. The resource restrictions are on the server side and therefore should also be enforced there, _in a sane way_. Otherwise the client applications have to be modified each time the policy changes (as they have been for years now), and that just doesn't seem right.

rick:

HTTP error codes 420 and 429 are meant specifically for this. (I don't know why we don't use them; that's on me, I thought we did.) But most API services rate-limit requests, and it is up to the API user to limit themselves. For the server side to do it, we'd need to keep a bunch of threads and TCP/IP connections stalled while we queue and process the requests. That's something we're not going to do...

BTW, I do read your posts; I just don't have the bandwidth to address everything specifically. I'm going to look over logs this weekend to see if there's anything we can adjust here. But the current scheme will be the permanent scheme.
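If the API did send 429 with a Retry-After header, a client could react to the signal instead of guessing at timings. A sketch, assuming the requests library and a cooperative server:

```python
import time
import requests

def get_with_backoff(url, params, max_tries=5):
    """Retry on 420/429, honoring Retry-After when present,
    otherwise backing off exponentially (1 s, 2 s, 4 s, ...)."""
    for attempt in range(max_tries):
        resp = requests.get(url, params=params)
        if resp.status_code not in (420, 429):
            return resp
        retry_after = resp.headers.get("Retry-After")
        time.sleep(float(retry_after) if retry_after else 2 ** attempt)
    resp.raise_for_status()   # still throttled after max_tries
    return resp
```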

castleage1974:

I'm not a programmer, I'm just a user, but it's fairly obvious that "the current scheme will be the permanent scheme" is PR doublespeak. It's broken now. The current scheme can't be permanent if it's broken.

CnCBoy:

@marv74: I tried 10 and got blocked again. An hour later I set it to 15. I will see if it works correctly.


#19  Edited By arkay74

Limiting on the server side is doable; constantly asking the API users to change their software just makes no sense. Next week you are going to tell them that it should now be 2 seconds, or that something else needs to be done. Taking matters into your own hands makes it tweakable. You wouldn't tell the website's visitors to only click on 3 links per minute, would you? Same thing. Limit or queue it on your side.

What do you expect to gain from the 1s delay? It is going to take us longer to get our books tagged. Hence more users will be online at the same time, which, again, is going to increase your load. This isn't a solution; you are just shifting the problem. You need a good strategy, not trial-and-error development. Load and performance tests don't hurt.

Marv74:

@cncboy: I tried 7, 8, 10, 12, 15, 25... all eventually fail. I don't know what to do anymore. This was a VERY useful resource and now it's ruined...

cbanack:

@marv74: @cncboy: Changing the scrape delay is not going to help you, because that is a per-comic delay, not a per-api-request delay. Just be patient. There will be a new version of Comic Vine Scraper that does not violate the new 1 second rule. I'm just waiting to hear back from @edgework about a bug that I am experiencing with it, and once it is working properly I'll release it for everyone. Keep an eye out for it over on the ComicRack forum.


RoboMan:

Regardless of the rate limitations imposed, these limitations need to work. Although the comicvine scraper may well exceed the 1 request/sec limit imposed, the implementation of the rate limitation appears fundamentally broken.

Testing with a simple C# app suggests that keys are arbitrarily blocked once an unspecified amount of activity is recorded, and that blocks apply to specific requests rather than generally.

E.g., calling http://www.comicvine.com/api/volume/4050-XXXX with a measured 4-second gap between requests resulted in a persistent 'slow down cowboy' error. This error appears only on this call; API requests for issue or issues are unaffected.
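A minimal version of the probe being described: one resource hit at a fixed 4-second spacing while watching for the block message (the volume id stays a placeholder, as above, and the key is hypothetical):

```python
import time
import requests

URL = "http://www.comicvine.com/api/volume/4050-XXXX/"   # placeholder id, as above
PARAMS = {"api_key": "YOUR_API_KEY", "format": "json"}

for i in range(50):
    r = requests.get(URL, params=PARAMS)
    snippet = r.text[:100].replace("\n", " ")   # block page may be HTML, not JSON
    print(i, r.status_code, snippet)            # watch for 'slow down cowboy'
    time.sleep(4)                               # the measured 4-second gap
```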

RoboMan:

The new limits are MUCH more restrictive than before. It appears that, on top of the 1/sec rate limit, there is now an hourly limit of 200 requests per specific API resource. The hourly limit only resets after 60 minutes of inactivity on all API resources, i.e., a single call to Issues will reset the waiting period for every resource that has been used, keeping their counters from returning to zero.

This is definitely not what has been stated above. Although Comic Vine has the right to restrict access as they see fit, they should clearly state what these restrictions are.

CnCBoy:

@cbanack: Thanks for the info and for the work you put into this. I use APIs in my daily job and I know it can be really frustrating sometimes.


#26  Edited By rick

@cbanack @cncboy @roboman etc.

Sorry... At first I was all

(reaction GIF)

...and then, while trying to prove a point, I found out that a developer (whom I won't name so you guys don't string him up) left a debug value of 5 seconds as the minimum spacing between requests. Then I was all

(reaction GIF)

We'll get the fix live on Monday. Sorry we can't do it now; there's a code freeze this week.

...and I'm sure many of you now are all

(reaction GIF)

but when this goes live everything will be

(reaction GIF)

and you'll all be

(reaction GIF)

Just remember, there are no bugs in the Comic Vine API.

(reaction GIF)

(reaction GIF)

theotherjasonf:

Great news! Thanks, @edgework, for taking the time to look into this closely.

CnCBoy:

@edgework: There is no bug, only undocumented features. :-)

I have a suggestion about the API. In an application of mine, we had an API method that returned a version number.

The client program could check that version against the hardcoded version it expects to use. Of course, the developer using the API could choose not to implement the version check; but if he does, the program can offer the user the choice to stop using it or carry on anyway. An API version check gives the API user a chance to avoid hitting an API the wrong way. (Sorry if I butchered the English a little; it isn't my first language.)
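Sketching that suggestion (entirely hypothetical, since the Comic Vine API exposes no such method): the client checks a version endpoint at startup and warns the user on a mismatch.

```python
import requests

EXPECTED_API_VERSION = "1.0"                          # version this client was built against
VERSION_URL = "http://www.example.com/api/version/"   # hypothetical endpoint

def api_version_ok():
    """True if the server still reports the API version this client expects."""
    server_version = requests.get(VERSION_URL).json().get("version")
    return server_version == EXPECTED_API_VERSION

if not api_version_ok():
    print("The API has changed since this program was written.")
    print("Update the program, or continue at your own risk.")
```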


rick:

Uh, the fix didn't make it into today's release but it is for sure queued up for tomorrow. Please refrain from stalking me over this...

(reaction GIF)

Marv74:


And I still get the message and can't scrape... what's up?

solidus0079:

@marv74 said:

And I still get the message and can't scrape... what's up?

Are you using the latest Comicvine Scraper version? It was updated recently to work with these API changes.

tglass1976:

@marv74 said:

And I still get the message and can't scrape... what's up?

Did you install the new version of CVS that was released over the weekend?


#35  Edited By Marv74

@tglass1976 @solidus0079: Yes, I've got it already. Oddly, over the night it went WAY over 200% and didn't stop... any relaxed rules for late-night scraping?

CnCBoy:

@marv74: I asked myself the same question. I check the API stats and stop when I see I've reached the limit, since I suppose I'm expected to stop. Nothing appears to stop the scraping on my end.

Marv74:

@cncboy: That's fine by me. I left the computer on all night and had nearly 6,000 scrapes done when I woke up. Not bad :D


#38  Edited By brokenxwing

@rick: I realize it was a long time ago that you posted this, but it seems to no longer be true. Also, the bans seem insanely long, like at least 24 hours if I make too many requests in a second. I've waited nearly 24 hours already and still NOTHING. Also, my API limit page says it's fine and that there's no limit, so what's up with that? Why are you guys becoming so restrictive with this stuff? Over the last few years it seems you've gotten more and more restrictive. What's the deal?

Why are you so hell-bent on limiting users' ability to EFFICIENTLY make use of your database? I mean, this guy's program is amazing; its abilities are just unparalleled. I've tried manually copying information from the site. Do you have ANY idea how bad that is? It took HOURS to update fewer than 100 issues. We're talking half a day, maybe. And it wasn't even able to get all the information, because it was too much work to do it.

Not to mention it would require me to literally open EVERY SINGLE issue in my browser, one by one, and then go through the atrocious pages trying to copy the titles, which are hyperlinks and therefore not as easy to copy and paste. I just don't understand: why do you even have a database like this if you aren't going to allow the people who want to USE IT to do so in the best ways?

I'm completely ignorant about APIs. I know essentially nothing about them, but from reading the thread you had with the creator of CVS, it seemed you had little interest in making your API better, let alone allowing more access to it. Why is it the consumer's job to worry about hitting your arbitrary new API limits, which get more and more restrictive and less and less usable? The fact that you haven't made ANY new posts about your updates in over a year is disconcerting as well...

So my point, I guess, is: what is the current API limit, how long is the ban if you exceed it, and why is it considered "malicious" in the first place if you accidentally exceed it? I was fine when I was using CVS, but I used a different program for a bit called ComicTagger, because it's better at remembering the thumbnails of covers after they're loaded once and better at renaming files based on their metadata. But it seems to have no setting for limiting requests to at most one per second. It got me banned MUCH quicker, and for at least 24 hours. Is there a way to get this ban removed now?


#39 pikahyper  Moderator

@brokenxwing: rick hasn't worked here for a very long time; he is no longer employed by CBSi.

imawindev:

@pikahyper: But who does then? Radio silence for months :(


#41 pikahyper  Moderator

@imawindev: There's really only one engineer left and he has to work on multiple sites.

imawindev:

@pikahyper: Thanks. So the API is abandoned. It doesn't make any sense to use it for my new project. Too bad!


#43 pikahyper  Moderator

@imawindev: Oh, it's not abandoned. Most of the people who use the API don't seem to realize that this is a wiki; the site itself is the priority. The API is just something we offer so that developers can utilize the wealth of data available on the site, and it is offered as-is. Right now the engineers might be short-handed, but they are still working on the site. There is a new version of the wiki platform in the works, and it is all new, from scratch I believe, but since they are short-handed it is taking longer. For Comic Vine specifically, the new platform and keeping the site running smoothly are the priority, but I know they still have more plans for the API; we just don't know when that will happen, unfortunately. For now the engineers are focused on their work instead of chatting on the site, which is a good thing. Hopefully they will finally hire some more engineers and the updates can speed back up.


#44  Edited By imawindev

@pikahyper: Thanks for the insight! The main problem is that the rate limiting is way too restrictive, there is no current information on exactly how it works, and there are no answers to questions about it.

The following is what I would like to do, but I have no idea whether it's okay with the current rate limiting. According to postings in this forum it's not, but those were from a long time ago, so I just don't know. Here we go:

I need to get the details for all issues in a volume. Sometimes the volume contains just one issue, but most of the time it's way more; e.g., The Walking Dead has 164 issues. First, rate limiting won't let me call /api/issue/ 164 times at once, and second, if there has to be a delay between the calls, it will take forever. How am I supposed to do this?
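One common pattern, sketched under two assumptions: that the issues list endpoint accepts a filter of the form volume:<id> (the API docs describe filtering on list resources) and that list results page at up to 100 per request. That would turn 164 per-issue calls into a couple of paged list calls plus the one-second spacing:

```python
import time
import requests

BASE = "http://www.comicvine.com/api/issues/"
PARAMS = {
    "api_key": "YOUR_API_KEY",   # placeholder
    "format": "json",
    "filter": "volume:18166",    # assumed id for the volume in question
    "limit": 100,                # assumed page cap for list endpoints
}

issues, offset = [], 0
while True:
    page = requests.get(BASE, params=dict(PARAMS, offset=offset)).json()
    issues.extend(page["results"])
    got = page["number_of_page_results"]
    offset += got
    if got == 0 or offset >= page["number_of_total_results"]:
        break
    time.sleep(1)                # stay inside the one-request-per-second rule

print(len(issues), "issues fetched")
```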


#45 pikahyper  Moderator

@imawindev: I have no experience with APIs, but from what was laid out in the original post it seems like you just have to delay the requests so you don't make more than one per second. That doesn't seem like much of a problem.

imawindev:

@pikahyper: But 164 calls will almost use up the rate limit of 200 calls per hour, and adding a delay of one second means getting the details for all issues will take almost 3 minutes. No user will accept that.


#47  Edited By pikahyper  Moderator

@imawindev: Three minutes isn't that bad for that amount of data. The API is a free resource, and users of these third-party apps need to be grateful for what can be given and use it in moderation, as we don't have the resources to allow more than is available. Without these limitations, the API calls regularly take down the entire site, because too many third-party users take the service for granted and flood the API with requests. We ran without them for a little while and it sucked.


#48  Edited By imawindev

@pikahyper: A simple solution would be to add a method to the API that gets the details for all issues in a volume in a single request.


#49  Edited By pikahyper  Moderator

@imawindev: It wouldn't be ideal, though, as it would run into timeout problems; we have volumes with thousands of issues in them. In its current form the API is fairly basic and can't handle complex or time/resource-intensive operations. It is still an information wiki; third-party apps have just chosen to use it as the backbone for their collection applications (as free alternatives to pricey collection apps that can afford more resources). The API may be able to handle small collections, but it can't handle large ones. I doubt the API was developed with collection software in mind, let alone multiple applications:

Now that you've found this motherload, do the right thing. Don't just steal it and build some crappy collection app.

- from the API page

jun6lee:

Am I blocked? Been adding a lot to Mylar3, so plausible.