2014-04-22 2 views
1

Heroku에서 호스트되는 응용 프로그램의 프로세스에 문제가 있습니다 (2xweb + 1x worder dynos). 프로세스가 끝나고 SendGrid를 통해 이메일이 생성되어 전송됩니다. 그것은 내가 문제를 해결할 것이라고 생각 노동자로 리팩토링 그래서 이것은 웹 작업자 시간 제한과 나쁜 유용성을 일으키는 시간이 좀 걸렸지 만, 나는이 같은 상황을 얻고있다 :작업자에 의한 Heroku 웹 작업자 시간 제한

Apr 10 17:12:48 wc heroku/web.2: Processing by DealsController#show as */* 
[request is processed] 
Apr 10 17:12:50 wc app/worker.1: [worker sending emails] 
[a lot of lines with debug data cut] 
Apr 10 17:12:53 wc heroku/worker.1: source=worker.1 dyno=heroku.16079351.858a3455-0b9f-4f75-9052-b419d4653703 sample#load_avg_1m=0.29 sample#load_avg_5m=0.07 sample#load_avg_15m=0.02 
Apr 10 17:12:53 wc heroku/worker.1: source=worker.1 dyno=heroku.16079351.858a3455-0b9f-4f75-9052-b419d4653703 sample#memory_total=240.45MB sample#memory_rss=240.34MB sample#memory_cache=0.11MB sample#memory_swap=0.00MB sample#memory_pgpgin=85990pages sample#memory_pgpgout=24436pages 
Apr 10 17:12:53 wc heroku/web.2: source=web.2 dyno=heroku.16079351.879182ef-13f0-4908-bf35-c487ccab6153 sample#load_avg_1m=0.00 sample#load_avg_5m=0.01 sample#load_avg_15m=0.01 
Apr 10 17:12:54 wc heroku/web.2: source=web.2 dyno=heroku.16079351.879182ef-13f0-4908-bf35-c487ccab6153 sample#memory_total=844.16MB sample#memory_rss=511.82MB sample#memory_cache=0.00MB sample#memory_swap=332.34MB sample#memory_pgpgin=223581pages sample#memory_pgpgout=92554pages 
Apr 10 17:12:54 wc heroku/web.2: Process running mem=844M(164.9%) 
Apr 10 17:12:54 wc heroku/web.2: Error R14 (Memory quota exceeded) 
Apr 10 17:12:54 wc app/web.2: ** [NewRelic][04/10/14 15:12:54 +0000 879182ef-13f0-4908-bf35-c487ccab6153 (468)] INFO : Starting Agent shutdown 
Apr 10 17:12:55 wc app/worker.1: ** [Bugsnag] Bugsnag exception handler 1.6.2 ready, api_key=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
[8 new relic lines cut] 
Apr 10 17:12:56 wc app/heroku-postgres: source=HEROKU_POSTGRESQL_WHITE sample#current_transaction=144542 sample#db_size=189720760bytes sample#tables=92 sample#active-connections=12 sample#waiting-connections=0 sample#index-cache-hit-rate=0.99981 sample#table-cache-hit-rate=0.99868 sample#load-avg-1m=0.36 sample#load-avg-5m=0.3 sample#load-avg-15m=0.285 sample#read-iops=38.367 sample#write-iops=13.221 sample#memory-total=7629464kB sample#memory-free=187884kB sample#memory-cached=6599816kB sample#memory-postgres=689216kB 
Apr 10 17:12:56 wc app/worker.1: ** [NewRelic][04/10/14 15:12:55 +0000 858a3455-0b9f-4f75-9052-b419d4653703 (99)] INFO : Installing DelayedJob instrumentation hooks 
[11 new relic lines cut] 
Apr 10 17:12:59 wc app/worker.1: Delayed::Backend::ActiveRecord::Job Load (1.6ms) UPDATE "delayed_jobs" SET locked_at = '2014-04-10 15:12:59.482734', locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2' WHERE id IN (SELECT id FROM "delayed_jobs" WHERE ((run_at <= '2014-04-10 15:12:59.481969' AND (locked_at IS NULL OR locked_at < '2014-04-10 11:12:59.482002') OR locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING * 
Apr 10 17:12:59 wc app/worker.1: ** [NewRelic][04/10/14 15:12:59 +0000 858a3455-0b9f-4f75-9052-b419d4653703 (99)] INFO : Starting Agent shutdown 
Apr 10 17:13:09 wc app/worker.1: Delayed::Backend::ActiveRecord::Job Load (1.6ms) UPDATE "delayed_jobs" SET locked_at = '2014-04-10 15:13:09.486147', locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2' WHERE id IN (SELECT id FROM "delayed_jobs" WHERE ((run_at <= '2014-04-10 15:13:09.485453' AND (locked_at IS NULL OR locked_at < '2014-04-10 11:13:09.485486') OR locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING * 
Apr 10 17:13:13 wc heroku/web.2: source=web.2 dyno=heroku.16079351.879182ef-13f0-4908-bf35-c487ccab6153 sample#load_avg_1m=0.00 sample#load_avg_5m=0.01 sample#load_avg_15m=0.01 
Apr 10 17:13:14 wc heroku/web.2: source=web.2 dyno=heroku.16079351.879182ef-13f0-4908-bf35-c487ccab6153 sample#memory_total=463.75MB sample#memory_rss=153.29MB sample#memory_cache=0.00MB sample#memory_swap=310.46MB sample#memory_pgpgin=245188pages sample#memory_pgpgout=205946pages 
Apr 10 17:13:14 wc app/web.2: Started GET "/user" for [IP.IP.IP.IP] at 2014-04-10 15:13:13 +0000 
Apr 10 17:13:14 wc heroku/worker.1: source=worker.1 dyno=heroku.16079351.858a3455-0b9f-4f75-9052-b419d4653703 sample#load_avg_1m=0.21 sample#load_avg_5m=0.06 sample#load_avg_15m=0.02 
Apr 10 17:13:14 wc heroku/worker.1: source=worker.1 dyno=heroku.16079351.858a3455-0b9f-4f75-9052-b419d4653703 sample#memory_total=157.25MB sample#memory_rss=157.14MB sample#memory_cache=0.11MB sample#memory_swap=0.00MB sample#memory_pgpgin=103179pages sample#memory_pgpgout=62923pages 
Apr 10 17:13:16 wc app/web.2: Started GET "/user" for [IP.IP.IP.IP] at 2014-04-10 15:13:16 +0000 
Apr 10 17:13:17 wc heroku/router: at=error code=H12 desc="Request timeout" method=GET path=/pages/planning host=www.cool-app.com request_id=c62a7ee5-11d8-4286-846a-a55861cc6a0e fwd="[IP.IP.IP.IP]" dyno=web.2 connect=2ms service=30000ms status=503 bytes=0 
Apr 10 17:13:19 wc app/web.2: E, [2014-04-10T15:13:18.948990 #2] ERROR -- : worker=1 PID:12 timeout (31s > 30s), killing 
Apr 10 17:13:19 wc app/worker.1: Delayed::Backend::ActiveRecord::Job Load (1.5ms) UPDATE "delayed_jobs" SET locked_at = '2014-04-10 15:13:19.489494', locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2' WHERE id IN (SELECT id FROM "delayed_jobs" WHERE ((run_at <= '2014-04-10 15:13:19.488845' AND (locked_at IS NULL OR locked_at < '2014-04-10 11:13:19.488874') OR locked_by = 'host:858a3455-0b9f-4f75-9052-b419d4653703 pid:2') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING * 
Apr 10 17:13:20 wc app/web.2: E, [2014-04-10T15:13:19.689336 #2] ERROR -- : reaped #<Process::Status: pid 12 SIGKILL (signal 9)> worker=1 
Apr 10 17:13:21 wc app/web.2: Disconnected from ActiveRecord 
Apr 10 17:13:27 wc app/web.2: Processing by UsersController#show as HTML 
[web worker 2 works properly from now on] 

web.2 노동자 여전히 충돌하여 사용자가 ~ 20 초 동안 멈추고 "응용 프로그램 오류"화면이 표시됩니다. 이상한 것은 다른 페이지에서 발생하며 백그라운드에서 작업자와 연결되어있는 것 같습니다.

특히 저 (과 충돌 아마 증상이있는 사람을) 혼란입니다 라인 :

Apr 10 17:13:19 wc app/web.2: E, [2014-04-10T15:13:18.948990 #2] ERROR -- : worker=1 PID:12 timeout (31s > 30s), killing 

은 무엇을 의미 하는가은? 작업자 = 1이 타임 아웃을했기 때문에 web.2 dyno가 죽는 것 같습니다. 조금 미쳤습니다.

동력계의 구성은 다음과 같습니다 Dyno config

Dynos: 
1x - 2 - web bundle exec unicorn -p $PORT -c ./config/unicorn.rb 
1x - 1 - worker bundle exec rake jobs:work 

어떤 아이디어?

+0

유니콘을 사용하고 있습니까?이 경우 유니버설 작업자를 말하는 '근로자'가 있습니까? – catsby

+0

@catsby 나는 그렇게 생각하지 않습니다. 단지 레이크 작업 일뿐입니다. 당신이 확신 할 수 있도록 Dyno 설정으로 질문을 업데이트했습니다. – GregPK

답변

1

해결 방법 : 스위치 delayed_job에서 누구에게 관심이 있습니다 sidekiq

에 - 디버깅 몇 시간에도 불구하고, 나는 정말이의 바닥에 얻을 수 결코 없었다. 이 하나의 작업을 sidekiq으로 전환하려고 시도했습니다. 이로 인해 문제가 해결되었습니다. 아마도 특정 delayed_job -> heroku 상호 작용이있는 버그 일 것입니다. 어쩌면이게 오늘 일하게 될지도 ...

관련 문제