Combining Two Kinects To Make Better 3D Video 106

Posted by Soulskill
from the this-is-what-happens-when-yo dept.
suraj.sun sends this quote from Engadget about improving the Kinect 3D video recordings we discussed recently: "[Oliver Kreylos is] blowing minds and demonstrating that two Kinects can be paired and their output meshed — one basically filling in the gaps of the other. He found that the two do create some interference, the dotted IR pattern of one causing some holes and blotches in the other, but when the two are combined they basically help each other out and the results are quite impressive."

  • by Anonymous Coward
    awesome
  • It's amusing how Kinect needs four microphones + calibration to replicate a feat we humans only need one ear for. To see 3D it apparently has to send out infrared dots, and even then it probably does a worse job than the good ol' brain.
    • Re: (Score:3, Interesting)

      There is a class of visual inputs that makes the human brain just tie itself in knots, even once you know what the trick is: "optical illusions", Escher stuff, and the like.

      I wonder what the class of "optical illusions" for the Kinect's vision system and algorithms is... Off the top of my head, I'd imagine that retroreflective materials might kind of freak it out; but I'd be curious to know if there are any stimuli that cause it to wig out in weird ways, the way that optical illusions do the human visual
      • by anss123 (985305)

        I wonder what the class of "optical illusions" for the Kinect's vision system and algorithms is

        I'm guessing kinect makes assumptions based on common human bone structure, e.g. something like a dog might freak it out and make it explode.

        • by JuzzFunky (796384)
          Check out this video where a guy has taken the model from the kinect and replaced the points with variable sized blobs. Looks cool. Fat Cat [vimeo.com]
      • Re: (Score:3, Interesting)

        by EdZ (755139)
        As the video demonstrates, the Kinect is fooled by spurious pattern projections from other Kinects in the vicinity. This could be solved by replacing the IR source in the 'projector' (actually a point source and a pinhole grid) with one of a different wavelength, and adding appropriate filters to the IR cameras in each Kinect. Each Kinect would then only see IR light of the 'colour' it emits. This would probably require the use of slightly brighter IR emitters.
        • by jrobot (1239050)

          CMOS sensors' light-gathering capabilities fall off with increasing wavelength.
          Silicon's quantum efficiency at NIR is much lower than visible. There's not a
          huge range of NIR to play in without QE falling off.

          IR diodes don't emit light over a single wavelength. Not only do they shift long with
          temperature, but the rated wavelength is really an average of the range the wavelength
          drifts over.

          Very tight bandpass filters tend to drift shorter in wavelength off axis.

          • by EdZ (755139)
            The Kinect uses a laser diode rather than an LED, so its output is very nearly monochromatic. You'd need some very narrow band-pass filters, but these are available, albeit sometimes bespoke.
        • This could be solved by replacing the IR source in the 'projector' (actually a point source and a pinhole grid) with one of a different wavelength, and adding appropriate filters to the IR cameras in each Kinect.

          Or maybe timing the grid light to be off while the other camera is on and vice versa, alternating back and forth quickly like 3D LCD 'shutter' lenses. This way the grids would not interfere with each other. You don't have to turn off the cameras; just don't use the 3D grid data from those frames where the opposite grid is being used.

          Just my .01 worth. :)

          • The (testable but not yet tested by the public, to my knowledge) question is whether a Kinect unit needs a prohibitive number of frames with its own IR unit on in order to figure out what is going on.

            If it takes 30-60 frames to do so, that is only a 1-2 second delay, which is nearly irrelevant from the perspective of the standard use case. Just have the menu do some slightly sci-fi transition during that time and they will never even notice.

            If, however, you are trying to use two or more Kinects with sh
    • Think about it; the Kinect is given a job most of us would laugh out of town. Build a sophisticated camera capable of full 3-D input and peripheral pickup, using only water and jelly. Build an eye and ears.

      We don't know how to use jelly yet, so we settle with plastic and metal.

      Still a crazy task.

    • by JanneM (7445) on Tuesday November 30, 2010 @08:16AM (#34386982) Homepage

      The "good ol' brain" does a fairly crappy job, actually. 3D vision systems like these tend to perform quite a bit better than we do. And we only do as well as we do because we can use a lot of indirect clues based on our long experience with a 3D-world - we know how big stuff normally is, for instance, so we can judge distance from size. Mess up those clues and we completely lose it.

      And even with good clues we don't actually measure distance well. Have somebody place items on a parking lot or some place like that, then try to guess the distances. Not going to be very accurate. Try to estimate distance vertically rather than horizontally and you'll do even worse; you have fewer clues and less experience to fall back on.

      • I would say both would be very accurate, considering no actual measuring would be taking place. You can extrapolate points of reference:

        - A car is approx 3m long and 2m wide. A parking space is about the same.
        - The lanes between spaces are 2 cars wide, to allow for idiots who can't follow the arrows.
        - Basic trig can give you any distance in a parking lot.

        The same applies to buildings. The average person is 6' tall, with 18" spare to the roof. The floor space is approx 6", making each floor approx 7'. Multiply $fl
        • Which is exactly what the parent said. Besides, look at it this way. You're using cars as an overlay grid. The Kinect is using a dot pattern projected in infrared. What's the difference? Or, if you were to go to an empty grassy field, how would your distance estimates do?
          • You can get training if you really care.

          • Rugby pitches. 100m long. Divide or multiply as required. Plus, a healthy background in outdoor pursuits gave me a good eye for horizontal distance.

            Plain buildings a la MiniPeace, however, would throw me completely.
      • by TDyl (862130)
        And even with good clues we don't actually measure distance well.

        Yep, just look at the quarterback for the Carolina Panthers.
      • by Abstrackt (609015)
        Your comment reminds me of an interesting experiment you can do in 2D. Show people a page containing nothing but a creature that can't possibly exist and ask how big it is. Obviously there's no way to answer without scale. If you put a picture of an elephant next to the creature it looks huge, but if you put a picture of a mouse next to it the creature looks small.
      • by VShael (62735)

        That's typically a product of training. We don't have much experience with it, because we don't need it.

        But take an aborigine, and ask him to estimate how far something is, and you'll get a good accurate answer, even if it's not in feet and inches.

      • And even with good clues we don't actually measure distance well. Have somebody place items on a parking lot or some place like that, then try to guess the distances. Not going to be very accurate.

        And yet we are able to navigate and interact with our environment with a high degree of precision. When I'm driving a car, for instance, without looking at how fast I'm going, knowing distances, the weight of the car, my acceleration and deceleration capabilities, I'm able to stop at a line painted on the road to within half a meter. Just with my eyes!

        I work with robots, and even knowing all this information to a high accuracy, there is so much work that needs to be done with localization, navigation, plann

        • by JanneM (7445)

          And yet we are able to navigate and interact with our environment with a high degree of precision.

          Yes, we are. Our vision system is pretty successful when you look at how we actually use it in the real world. We don't actually need to know the precise distance to things; what we want to know is rather direction and time to impact and similar and we're really, really good at that (look up tau-margin estimation for instance). Though note that with a human-level vision system you would still need a lot of thos

          • by anss123 (985305)

            But I wrote this in reply to a poster that seemed to believe we humans are actually better than Kinect at the specific vision tasks it's built to do.

            But we are better. Kinect is built to recognize faces and body postures, it’s not built to estimate the distance from you to the TV even if it can do that more accurately than we can.

            • by drinkypoo (153816)

              But we are better. Kinect is built to recognize faces and body postures, it’s not built to estimate the distance from you to the TV even if it can do that more accurately than we can.

              That is a ridiculous statement. Kinect builds heightmaps. If that's not estimating the distance from you to the TV then I don't know what is. Kinect in fact does the other cool things it can do specifically because it is built to estimate the distance from you to the TV, when other camera systems are not. If this was ALL it would do you could still do the same stuff on the 360 in software, but it would take away from the available processing power which is why embedding it as a complete solution was the sma

              • by anss123 (985305)
                I think you misunderstood me. Building height maps is just a means to an end; the end being figuring out just what you're doing with those limbs of yours. The Kinect wasn't created/built for the purpose of measuring objects, even if it's better at this than us humans.
                • by drinkypoo (153816)

                  The Kinect wasn't created/built for the purpose of measuring objects, even if it's better at this than us humans.

                  That's a big fail of a response. The statement that prompted your original comment was "But I wrote this in reply to a poster that seemed to believe we humans are actually better than Kinect at the specific vision tasks it's built to do." and you said "But we are better. Kinect is built to recognize faces and body postures, it’s not built to estimate the distance from you to the TV even if it can do that more accurately than we can." But that is plainly false. Kinect is built to measure the distance f

                  • by anss123 (985305)

                    I simply disagree with everything you said

                    So you believe that Kinect is superior to humans at recognizing faces and body postures?

    • by Takichi (1053302)
      You are essentially just comparing the brain to the computer. We would likely have better spatial resolution if we had more ears and eyes as well. And most of the capabilities of the ear, especially in regards to space, is learned based on the combination with other responses like vision and touch. If you lived your life from the beginning with your only sense being a single ear, you'd probably do worse than a Kinect unless someone explicitly taught you what the things you were hearing meant, if you could
    • Do you really consider it a fair competition: your brain against a device that has a sales price of $150?
  • u don't conform to the character limit for sub-headings?
  • Anybody in optics? (Score:5, Interesting)

    by fuzzyfuzzyfungus (1223518) on Tuesday November 30, 2010 @07:55AM (#34386860) Journal
    How cost- and/or physics-prohibitive would it be to exploit the fact that "IR" actually covers a range of frequencies of invisible-to-the-naked-eye light with similar properties? Could one modify a Kinect with appropriate narrow-band filters, so that a second Kinect, with filters for a different narrow band, wouldn't even see the dot pattern of the first? If so, how many Kinects could share a room this way (or, at what point do the required filter narrowness and wavelength tolerance become absurdly costly)?

    Is that A) wholly impractical, because of some sort of effect the reflecting materials would have on the IR wavelengths, B) sure, it's possible, but have you checked the supplier's price list for narrowband IR filters recently, or C) just a bit of eBay and some steady hands?

    Perhaps more practically, I wonder if the Kinects could (with some mixture of hardware shutters and firmware or driver mods) be made to trade off sample rate for coverage (i.e. if the Kinects are ordinarily taking 60 frames/second, could two Kinects be made to take 30 frames/second each, turning off their IR source when it isn't their turn, and turning it on when it is), or does their mechanism of operation require too much time to calibrate itself on startup?
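A rough sketch of the time-slicing idea above, in Python. It assumes the IR projectors can be switched on and off per frame and that all units share a frame clock; neither is known to be possible on a stock Kinect. The function names are made up for illustration, and the 60 frames/second figure is just the hypothetical from the comment.

```python
def projector_schedule(num_units, num_frames):
    """For each frame index, which unit's IR projector is on (round robin)."""
    return [frame % num_units for frame in range(num_frames)]


def usable_depth_frames(unit, schedule):
    """Frames a unit keeps: only those lit by its own projector."""
    return [frame for frame, active in enumerate(schedule) if active == unit]


def effective_depth_rate(camera_fps, num_units):
    """Depth frames per second each unit gets once the slots are shared."""
    return camera_fps / num_units


schedule = projector_schedule(num_units=2, num_frames=6)
print(schedule)                          # which projector is lit in each frame
print(usable_depth_frames(1, schedule))  # frames unit 1 may use for depth
print(effective_depth_rate(60, 2))       # hypothetical 60 fps shared by two units
```

The cost is plain from the last line: every extra unit divides the depth frame rate, which is the trade-off the comment asks about. Whether the depth pipeline copes with its projector blinking is the open question.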
    • by Xelios (822510) on Tuesday November 30, 2010 @08:49AM (#34387146)
      He touched on these ideas in another of his videos [youtube.com] from before this latest one.
    • Re: (Score:3, Informative)

      by Vario (120611)

      It is definitely possible to use some narrow bandpass filters. In the infrared region there are various filters available that have a wavelength window of 10 nm at 1000 nm. These filters are not available at Walmart, but they are not too costly either. Depending on size, quality, wavelength and other parameters you should be able to buy some for $50 (Thorlabs).

      To actually hack the Kinect you have to test, whether there are other infrared filters used and if the camera is sensitive enough at different wa

    • Wouldn't polarized filters do the trick?

      • by slim (1652)

        Wouldn't polarized filters do the trick?

        As someone in another thread points out, polarization is lost when light is scattered as it reflects (3D cinemas have special screens).

        Also, polarizing gives you two channels. Bandwidth selection gives you many.
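To put a rough number on "many": a back-of-envelope sketch in Python, assuming each Kinect gets its own non-overlapping passband with a guard gap between neighbours to absorb diode drift and off-axis filter shift. The 10 nm filter width comes from the comment above about available filters; the 800-900 nm usable band and 20 nm guard gap are invented purely for illustration, and `max_channels` is a hypothetical helper.

```python
def max_channels(band_lo_nm, band_hi_nm, filter_width_nm, guard_nm):
    """Count non-overlapping passbands that fit in a usable band.

    Each channel occupies filter_width_nm, and neighbouring channels are
    separated by guard_nm of unused spectrum.
    """
    usable = band_hi_nm - band_lo_nm
    if usable < filter_width_nm:
        return 0
    # First channel needs only its own width; each further one needs width + guard.
    return int((usable - filter_width_nm) // (filter_width_nm + guard_nm)) + 1


# Illustrative values only: a 100 nm usable band, 10 nm filters, 20 nm guard gaps.
print(max_channels(800, 900, 10, 20))
```

With these made-up numbers the channels sit at 800-810, 830-840, 860-870 and 890-900 nm. The real limit depends on how far the sensor's sensitivity extends and how much the emitters drift, as discussed elsewhere in the thread.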

      • Not really, as the surface reflecting the light has to preserve the polarisation, and as anyone who's set up a dual-projector 3D rig with polarised light can attest, you need a special surface coating to get good preservation of polarisation.

        Paint with silver particles in it is typically used for painting 3D screens, for example.

    • The best way to do this would be to modify the firmware to include some kind of pseudorandom modulation scheme (think binary chip sequence). However, the processor on the Kinect is a PrimeSense proprietary ASIC. Good luck reverse engineering it.
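For what a chip-sequence scheme might look like in principle, here is a toy Python simulation. It is not how the PrimeSense hardware works (that is proprietary). The sketch assumes each projector can be switched on or off per sample according to its own code and that the camera reads one pixel's brightness per sample. The two codes are cyclic shifts of a 127-chip maximal-length LFSR sequence, and all names and values are illustrative.

```python
def m_sequence(length=127, state=0b1111111):
    """One period of +1/-1 chips from a 7-bit LFSR, b[n] = b[n-6] XOR b[n-7]."""
    chips = []
    for _ in range(length):
        new_bit = ((state >> 5) ^ (state >> 6)) & 1
        state = ((state << 1) | new_bit) & 0x7F
        chips.append(1 if new_bit else -1)
    return chips


def rotate(chips, shift):
    """Cyclic shift, giving the second unit a distinct code."""
    return chips[shift:] + chips[:shift]


def received_signal(amplitudes, codes, ambient):
    """Brightness at one pixel per sample: projector on for chip +1, off for -1."""
    n = len(codes[0])
    return [
        ambient + sum(a * (code[i] + 1) / 2 for a, code in zip(amplitudes, codes))
        for i in range(n)
    ]


def estimate_amplitude(samples, code):
    """Correlate with our own code to pick out our own projector's light."""
    return 2 * sum(s * c for s, c in zip(samples, code)) / len(code)


code_a = m_sequence()
code_b = rotate(code_a, 40)
# Unit A contributes 1.0, unit B 0.6, plus constant ambient IR of 0.3.
samples = received_signal([1.0, 0.6], [code_a, code_b], ambient=0.3)
est_a = estimate_amplitude(samples, code_a)
est_b = estimate_amplitude(samples, code_b)
print(round(est_a, 2), round(est_b, 2))
```

Correlating against a unit's own code largely cancels the other unit's dots and the ambient light, because shifted copies of a maximal-length sequence are almost uncorrelated. The catch is the one already noted: it needs many samples per depth estimate and control over the projector that the stock firmware does not expose.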

      Shuttering might work, but as you said, you'd reduce the overall framerate, meaning worse motion capture. Also you'd need to synchronize the shutters somehow, and that'd be a pain.

      Filtering would change the sensitivity of the camera, but it won't do much to
      • Why couldn't you filter both the diode and the camera?
        I.e. (Normal/Today's world):
        Diode 1 emits light across the entire IR-A spectrum (700 - 1400nm)
        Camera 1 detects light across entire IR-A spectrum (700 - 1400nm)

        Diode 2 emits light across the entire IR-A spectrum (700 - 1400nm)
        Camera 2 detects light across entire IR-A spectrum (700 - 1400nm)


        Apply filters to both emitter and detector, on both Kinect 1 and 2:
        Diode 1 emits light across the entire IR-A spectrum (700 - 1400nm), filter is applied so
        • It depends on the laser diode they're using. If it emits a wide enough band of light, then sure, it can be filtered. I just doubt that it does.
  • So won't 3 Kinects make 3D video?
    • by arshadk (1928690)
      You can't have 6 minute abs! 3 or 4 could make for a pretty cool image all the way round. I wonder if this could be paired with CAD software and a 3D printer.
    • I'd expect that 3 Kinects would make 4D video.

    • So won't 3 Kinects make 3D video?

      I get what you're saying. With 3 of these you should be able to get x,y,z coordinates. However, each of these is capable of getting the x,y,z for surfaces facing the camera, the problem is you need to hit all the surfaces. With 6 Kinects to cover front, back, left, right, top and bottom you could probably have the best coverage, but I expect four of them, one in each corner of the room like security cameras, would provide similar results.
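A minimal Python sketch of the merging step, assuming each Kinect already yields (x, y, z) points in its own camera frame and that each camera's position and heading in the room are known. Getting those poses accurately is the hard calibration part and is not shown. The function names and the example poses are hypothetical.

```python
import math


def to_world(points, yaw_deg, position):
    """Move camera-frame points into a shared room frame.

    The camera looks along its own +z axis; yaw_deg rotates it about the
    vertical (y) axis, and position is where the camera sits in the room.
    """
    yaw = math.radians(yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    px, py, pz = position
    return [(c * x + s * z + px, y + py, -s * x + c * z + pz) for x, y, z in points]


def merge_clouds(clouds):
    """Concatenate per-camera clouds that are already in the room frame."""
    merged = []
    for cloud in clouds:
        merged.extend(cloud)
    return merged


# Two cameras at right angles, both seeing a point 2 m straight ahead of them.
cloud_a = to_world([(0.0, 0.0, 2.0)], yaw_deg=0, position=(0.0, 0.0, 0.0))
cloud_b = to_world([(0.0, 0.0, 2.0)], yaw_deg=-90, position=(2.0, 0.0, 2.0))
merged = merge_clouds([cloud_a, cloud_b])
print(merged)
```

In this toy setup both cameras report the same room point, which is what lets one unit fill in surfaces the other cannot see. With more units the same transform is applied once per camera.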

  • This makes for real 3D movies. Capture the streams from both sources, combine in real time in the viewer, and you're able to change your PoV and focus independently of any other observer.

    This is revolutionary for entertainment. Not stereoscopy.
  • by Anonymous Coward

    Can you imagine a beowulf cluster of kinects??

  • Am I the only one imagining getting a Kinect or two in every room of their home and then using them to fly through the 3D video feed of their apartment?
    • by mrsurb (1484303)
      Plus potential advertisers [gearlog.com] will be able to try to sell you what they know you don't own!
      • No need to export the video feed to them as I just want the 3d video stream for myself and perhaps a few selected friends.
        Would be cool to insert avatars from several people there though. A 3d video version of the old MOO concept, or a local second life with a live video background to use a more modern analogy.
        Bandwidth will be an issue however.
  • With all this stuff in the news recently about backscatter machines and the need for improved x-ray machines, this sort of system would be fantastic for improving the quality of screening, being able to look in and see depth in luggage.

  • by bhunachchicken (834243) on Tuesday November 30, 2010 @08:33AM (#34387066) Homepage

    ... is good, but I'm holding out for 4 Girls, 3 Kinects, 2 Boxes, 1 Cup :)

    • Re: (Score:2, Funny)

      by Anonymous Coward

      ...and a partridge in a pear tree?

    • Re: (Score:3, Funny)

      by acoustix (123925)

      ... is good, but I'm holding out for 4 Girls, 3 Kinects, 2 Boxes, 1 Cup :)

      Correct me if I'm wrong, but if there's 4 girls then wouldn't there also be 4 boxes?

      Just sayin'...

    • by g1zmo (315166)
      And a jar.
    • by jhantin (252660)
      ... and zero good taste, apparently.
  • by John Pfeiffer (454131) on Tuesday November 30, 2010 @08:53AM (#34387174) Homepage

    When I first saw the video of one Kinect, I immediately wondered how you could get multiple units working together.

    It wasn't until I watched the video again later that day that it hit me. I had just explained to someone how 3D theater projection works, and so I had an epiphany: The most sensible course is to use polarizing filters.

    With filters on the IR emitters and cameras, the units should be able to only see their own IR illumination. Of course, it would only work for two Kinects with maximum effectiveness, but considering how well this turned out with the units at right-angles from each other, I don't see why you couldn't combine the two ideas for 3-4 units and get sufficient quality.

    I wish I had the money to get a couple Kinects and test my idea, but I'm no good with coding anyway.

    It'd be awesome to see the Blender Foundation put out a bounty for a Kinect-based open source motion capture and 3D scanning suite though. :D

    • by Anonymous Coward on Tuesday November 30, 2010 @09:19AM (#34387340)

      Unfortunately, this wouldn't work very well. Light tends to lose its polarization somewhat when it bounces off of things. In a theater that's OK because you can use a special screen that maintains the polarization. Band limiting each kinect would be more effective than polarization (and would also scale better - polarization only allows for 2 kinects; the bandpass idea would only be limited by how good your filters are).

    • Light is re-polarized when bouncing off of things, that's why people wear polarized sunglasses; it eliminates glare.

      Unfortunately, you wouldn't be able to predict the resulting polarization with great confidence off of curved surfaces at strange angles like bodies have.

    • by tibit (1762298)

      I'd do it in a different way that may well be lower cost and more scalable than any wavelength- or polarization-based selectivity.

      1. Run the Kinects off a common reference frequency. The onboard circuitry probably uses one crystal oscillator and PLL-controlled VCOs to generate various derivative frequencies to time everything. A common reference will keep all Kinects phase-synchronized, while the phase itself may well be random.

      2. Figure out how to discover the phase angle when the IR camera shutter is open

  • The results look almost identical to the kind of data I get from the NextEngine [nextengine.com] 3D laser scanner. To create a 3D surface, the device sweeps a laser across the object in front of it. The laser sweeps a vertical line, and shines on the (arbitrary) surface of the object in front of it. Stereo cameras capture the shape of the laser line from different angles, and software is able to extract the 3D surface from there. An accompanying visible light image from one camera or the other is used to apply a "skin
  • Wow, imagine a Beowulf.... blah blah.
  • 1. Find YouTube channel with worthy content
    2. Subscribe
    3. Share new videos on Slashdot
    4. ????
    5. PROFIT!
  • Basic Webcam (Score:2, Informative)

    by jgtg32a (1173373)
    Ya know to the best of my knowledge you cannot use the Kinect as a webcam in Skype. I would love to buy a Kinect but I need a reason other than awesome tricks, I need useful functionality.
  • Just wait for the flood of homemade 3d pr0n :) (hey somebody had to say it)
    • Actually, it seems likely to me that geeky (as in geeks will be the ones doing it) amateur 3D porn will become commonplace before commercial 3D porn.
  • Given a quality enough image, bandwidth, and some motion-sensing gear (ahem), any immersion-style display (HUD, dome, etc) could allow for real-time panning of a distant location.

    Examples:
    - shooting a net of these at an operating table would let remote viewers move around the room and view the procedure without crowding the room or being limited to the perspective of a single camera.
    - a web site could point this setup at anything interesting (lab experiment, box of

  • Isn't it nice to know that someone at Microsoft could be checking in on our kids doing gymnastics? Most of us will just be leaving it plugged in all the time in our living rooms... I feel safer already.
