Architecture: .NET processing queue communication between master and worker servers
I'm trying to figure out the best way to create a master-worker architecture for simple job delegation.
One master process delegates jobs to several worker processes.
- The master needs to run continually and delegate jobs to the workers (and perform other tasks).
- The workers (on different servers) need to receive a job, process it, and report back.
- The master process will receive a queue of jobs and delegate them to the worker nodes, which process each request and notify the master that the job has been processed. The master does not need to wait for the workers to complete: it can delegate a job and receive an update from the worker when it has completed.
What is the best way to facilitate this communication in .NET? I already have class libraries for the job processing and am just looking for the communication method.
MSMQ? A Windows service? Remoting?
Thanks
For this purpose, I used WCF with net.tcp bindings and a callback interface to let the master control program know when a job was done (yep, it was called "the MCP", the processes it initiated to do the jobs were called "Sark", and the network was referred to as the "Game Grid", go figure).
"Sark" was implemented as both a console app and a Windows service (for ease of development and to "dip our toes" on a new worker machine), while the MCP was a long-running GUI. If I were to re-implement it, I'd make the master control a Windows service as well, but it was advantageous for the department to see which jobs were scheduled and to pause the MCP when a backup job or other maintenance task was due. Today, I'd still make the MCP a service and provide a GUI to "remote control" it.
The jobs were written as .NET DLL assemblies with an interface for invoking the task. "Sark" would copy the latest version of the binaries from a fileshare, create a new AppDomain, load and run the job within that AppDomain, and shut it down when done. This made it possible to update jobs without having to restart either the MCP or the "Sarks".
In addition, each "Sark" instance used MSMQ with short-lived messages (a timeout of 10 seconds) to report the load on its worker machine. The MCP would use a weighted random pick to choose which worker machine to dispatch a job to. That is: if a machine reported that it was 80% idle, it had 80 'votes' to take the next scheduled task, which meant more than the 10 votes of a machine that was 10% idle. This was a reasonably effective way of distributing load evenly while avoiding hotspots.
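The 'votes' idea can be sketched in a few lines. This is a language-neutral illustration in Python rather than the original .NET code, and the function name `pick_worker` and the dictionary of idle percentages are my own assumptions for the example, not part of the system described above.

```python
import random


def pick_worker(idle_percent, rng=random):
    """Pick a worker at random, weighted by its reported idle percentage.

    `idle_percent` maps a (hypothetical) worker name to the idle
    percentage it last reported, e.g. {"grid-01": 80, "grid-02": 10}.
    """
    # Machines reporting no idle capacity get zero votes, so drop them.
    candidates = {name: idle for name, idle in idle_percent.items() if idle > 0}
    if not candidates:
        return None  # no worker has spare capacity, or no reports arrived

    names = list(candidates)
    weights = [candidates[name] for name in names]
    # random.choices draws with probability proportional to each weight,
    # so an 80% idle machine is chosen about 8 times as often as a 10% one.
    return rng.choices(names, weights=weights, k=1)[0]
```

Because the pick is random rather than "always the idlest machine", a single lightly loaded worker is favored but does not receive every job in a burst.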
I chose the net.tcp WCF binding to dispatch jobs and receive results because "fire and forget" did not work in practice. Failure happens: there were out-of-memory exceptions, problems with the runtime on the server (a major one arose when a task needed to work with FoxPro tables, and the ODBC FoxPro driver from Microsoft didn't come in a 64-bit version, which was a problem when we had a mix of 32-bit and 64-bit worker machines), and a hundred unexpected contingencies, such as a worker sitting on an IP address that was firewalled from a machine it wanted to talk to. The immediate response on the callback gave the MCP the option to re-assign the job to a worker on a different machine.
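The re-assignment logic can be illustrated roughly as follows. This is a simplified, synchronous Python sketch, not the WCF callback code itself: `send` is a hypothetical stand-in for the remote call, and the function name and `max_attempts` parameter are assumptions made for the example.

```python
def dispatch_with_reassign(job, workers, send, max_attempts=3):
    """Try a job on successive workers until one reports success.

    `send(worker, job)` is a hypothetical stand-in for the remote call.
    It returns True on success, and returns False or raises on failure.
    Returns (worker_that_succeeded_or_None, list_of_workers_tried).
    """
    tried = []
    for worker in workers[:max_attempts]:
        tried.append(worker)
        try:
            if send(worker, job):
                return worker, tried
        except Exception:
            # Any error (unreachable host, crash, missing driver) counts
            # as a failed attempt, and the job moves to the next worker.
            pass
    return None, tried
```

The point is only that the master learns about a failure promptly and still holds the job, so it can hand it to another machine. With pure "fire and forget", the failed job would simply be lost.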
MSMQ, however, was perfect for reporting the health of the worker machines, because the MCP didn't have to be running in order for the workers to report their burden, and the short timeout on the messages meant that the MCP wouldn't receive information that was stale.
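To show the effect of the short timeout, here is a small Python sketch. In the real setup the queue itself expired old messages. This example models the same behavior in-process instead, and the `LoadBoard` class, its method names, and the injectable `clock` are all assumptions for illustration.

```python
import time


class LoadBoard:
    """Keeps the latest load report per worker and ignores stale ones."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the expiry can be tested
        self._reports = {}       # worker name -> (idle_percent, time received)

    def report(self, worker, idle_percent):
        # A newer report from the same worker replaces the older one.
        self._reports[worker] = (idle_percent, self._clock())

    def fresh(self):
        """Return only the reports received within the last ttl_seconds."""
        now = self._clock()
        return {
            worker: idle
            for worker, (idle, seen) in self._reports.items()
            if now - seen <= self._ttl
        }
```

A worker that has crashed or lost its network connection stops reporting, drops out of `fresh()` after ten seconds, and so stops being a candidate for new jobs without any explicit "I'm down" message.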