Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!

Voice Over IP for Linux Games? 147

Posted by Cliff on Monday June 11, 2001 @06:39PM from the better-frags-thru-team-communication dept.

fathom asks: "A few friends and I are attempting to move all of our gaming from the Windows platform to whatever distro of Linux we like. For the most part we've all had great success: just about everything we play is fine under Linux. However, there is one major drawback: we don't know of any software programs for Linux to do Voice Over IP like BattleCom, RogerWilco, and the GameVoice. Are there any programs out there like this for Linux?" Why limit to Linux? What Voice Over IP software can be used for any Unix that's flexible enough to work for other applications as well as games? We did a similar question about a year ago; has anything changed since then?

This discussion has been archived. No new comments can be posted.

Voice Over IP for Linux Games?

Load All Comments

Search 147 Comments Log In/Create an Account

Comments Filter:

Speak Freely (Score:1)

by Anonymous Coward writes:

Speak Freely for Linux [fourmilab.ch] might be what you need.
Re:VoIP in Java (Score:1)

by Anonymous Coward writes:

Using Java for VoIP sounds cool, but how did you overcome the terrible latency issue with Java Sound? Or did you go with less than 100% pure Java approach (e.g., JNI). I tried to use Java for VoIP once, but found that Sun's Java Sound implementation in Java 1.3 was quite lousy in terms of performance. I measured a capture/playback latency of about 200 msec with a simple audio loop program. This is about an order of magnitude greater than the limit of what I would find acceptable. In my opinion, Sun's Java Sound API is seriously screwed up. You can't even do full-duplex operation under Linux with Sun's Java 1.3/1.4beta. Until Sun fixes the latency problem and some of the significant javasound-related BugTrac bugs, I'll stick with c/c++ for VoIP.
Re:You forget cost/profit analysis (Score:1)

by Anonymous Coward writes:

I figured you might want a response from a game dev so here goes.

1. Why would I care about portability? Aside from my own personal desire to run a game on Linux, there's no reason. It isn't going to increase my profits in any significant way, and it's going to take time away that I could be using to do other things (like fix outstanding bugs).

2. SDL doesn't really do anything 3D-wise as you imply. What support it does have is OpenGL-centric which means I can't use the latest DirectX 8 features. DirectX is pretty good these days. If you haven't messed with it recently you should take another look.

3. The argument about MFC / QT is fairly moot. My company, for the most part, uses neither. Almost everything we'd use MFC for is available in the STL. And the rest (GUI stuff) such as our menus and interface code are done in-house.

Overall the codebase I work on IS fairly portable. But there's more to putting out a port than just recompiling.

The whole argument is time vs. money. If the extra time spent will generate enough profits to offset it, then porting a game gets considered. In most cases, however, the extra time for porting it, testing the port, keeping the port version synced with the Win32 version, and supporting the port are not worth it (with the notable exception of porting a game to a console such as the PS/2 or X-Box).
Re:Responsibility of Game Publishers (Score:1)

by Anonymous Coward writes:

game publishers need to get off their arse and provide a linux version of each release
I see you are another GNU/Linux zealot. I'm not sure if you realise a few things. First of all, if you speak/write like that to a game publisher and/or developer, you will piss them off and they'll just ignore you. I would just ignore you too. The publishers are not necessarily the ones that will provide a GNU/Linux version to a game. IF it is profitable to have GNU/Linux ports to games on a regular basis, it would be done. Most game developers and publishers have nothing against GNU/Linux. The publisher wouldn't care if they publiushed C64 games, as long as it made a profit. The problem is that not enough people will purchase them. Secondly, it can take a lot of work to port a game, depending on how heavily it depends on MS Windows APIs. Sure they could use cross-platform libraries such as Qt, wxWindows, and WineLib, however they will generally slow down the development process and make the game run slower. The point of making a game is to make money, not support some OS. I like GNU/Linux a lot, however from a game/profit point of view, GNU/Linux might as well be a TRS-80.
Re:VoIP in Java (Score:1)

by Anonymous Coward writes:

A scripting language (such as UT's) is hardly a JavaVM as far as memory is concerned.

I'm not so sure you know what a JavaVM is then. In windows its a collection of dll's that bytecode runs on top of. This alone doesn't make it a memory hog. go to the Tomcat [apache.org] website. Tomcat is a very nice web app server running on a JavaVM with a memory footprint of around 10 megs when processing jsp's, servlets and serving content. And much less when sitting idle waiting for requests.
Of course UnrealScript doesn't run in a "Java"VM, but it does run in a virtual machine.
And your python example is nice, but way too stripped down to realistically use it for a VoIP app. Like the original post said, most of the necessary stuff was already in the existing Java API's.
With your embedded vm, you would be starting from scratch
Try these out (Score:3)

by Anonymous Coward writes: on Monday June 11, 2001 @02:53PM (#159464)

Open H.323 [openh323.org]
sipc [columbia.edu]
Video Conferencing for Linux [hananet.net]
Voice over IP technologies are the same as those used for video conferencing, but with audio codecs only. The two VoIP/VideoConf standards for call setup and control are H.323 and SIP.

Share
twitter facebook
Re:Speak Freely for Unix (Score:1)

by Brett Viren ( 296 ) writes:

Speak-freely is a venerable program and there is both a Linux and a windows version which can intercomunicate, but its sound quality is pretty poor. `ohphone' gives much better sound quality and is easier to set up.
H.323 (Score:2)

by Brett Viren ( 296 ) writes:

Coincidently I was just starting to mess with H.323 this weekend. With a direct net connection, one can use `ohphone' with out needing any other programs (except another one to talk to, of course). It's pretty simple to set up and use. If you are behind a NAT router things are trickier (I haven't gotten it to work yet). There are Debian packages for `ohphone' and a few other H.323 packages.
I was also able to use openmcu to provide rudimentary group conference services. This is also packaged for Debian.
BTW, with ohphone you can talk with windows Netmeeting users.
Re:Responsibility of Game Publishers (Score:1)

by jedidiah ( 1196 ) writes:

It's not the developers that are relevant here but management. Game developers by definition are a rather "nerdy" lot. Claiming that you need to have a developer with a special axe to grind is simply false.

Also, multiplatform programing subjects the development process to constraints that it might not otherwise have. These constraints actually ultimately benefit the development process by making it more structured.

Besides, a popular PC game is bound to be ported to "one of those other non-DirectX platforms" sooner or later.

Infact, EA's most profitable game platforms don't even run DirectX... (just read their annual reports)

The managers are the bottleneck, not the developers.
Re:Let's avoid being ANAL (Score:1)

by Thomas Charron ( 1485 ) writes:

I was purposely avoiding this thread, but you summed it up in a very nice way. Not as nice as I woulda been, but nice.. Thanks.. 8-)
Re:Have you considered... (Score:1)

by Thomas Charron ( 1485 ) writes:

*POP* ^ | (Brain cells overloading)
Game publishers need $ figures (Score:1)

by Matt Lee ( 2725 ) writes:

Petitions, lists of names, and pleas will not get you Linux games; money will. There's been a few high-profile games released with Linux versions, and their sales figures were abysmal.

True, almost all of these games had some weird circumstance that may have contributed to the poor sales figures (like release timing vs. win32 version, etc), but regardless of the circumstances, Linux games just don't sell.

And let's not forget the difficulty of deploying and supporting PC games on Linux. DirectX may not be perfect, but it's a huge, huge positive for Win32, especially in terms of installation and support -- and it's getting better all the time. OpenGL is nice as well, and cross platform to boot, but try providing a concise, clear set of instructions (or a single installer) that gets accelerated OpenGL running on Linux across a wide variety of consumer graphics cards. It's just not there. Also, I'm sure the various small differences between Linux distros can prove to be a headache as well.

A game platform needs some sort of central authority controlling the feature sets, the quality of drivers, and a consistent, usability tested installer. This central authority provides a lot of platform stability while sacrificing a small amount of absolute developer freedom. Microsoft provides this for Windows, and it works remarkably well considering the wide variety of hardware and software configurations out there. No one entity is centrally controlling and planning the gaming experience on Linux, and this is bad.
Re:Compression too! (Score:1)

by peter ( 3389 ) writes:

gzip doesn't really help much on audio. I wouldn't bother, since I'd only use this on a local network anyway. I guess if you're encrypting, then gzip doesn't add much overhead, though.

I actually have used something like this, but with netcat execing an mp3 decoder with stdout going to a socket. The other end of the socket is on a computer too slow to decode MP3s in real time, running rawplay with stdin coming from a socket.
The slow machine (a P75) has bad memory bandwidth too, so the 1.5Mb/s data stream still slows it down noticeably. Fortunately, I'm shuffling around my computers, and I'm retiring the P75.
#define X(x,y) x##y
Re:VoIP in Java (Score:2)

by peter ( 3389 ) writes:

You might want to use Vorbis. (see www.xiph.org) The encode/decode functions are in a library, so it should be easy to use. You could look at the source code for oggenc to see what library calls it makes. Check out the docs at http://www.xiph.org/ogg/vorbis/doc/vorbisenc/ (some of the links in the encoding library API ref are 404, it seems...).

Another reason for using vorbis is that it will keep Fraunhoffer from suing you for using an MP3 encoder.
#define X(x,y) x##y
Team up w/ Apple (OS X) - MORE BUYERS! (Score:1)

by undo ( 3635 ) writes:

Look out for the Mac OS X versions... The combined markets of linux w/ OS X is exactly what linux needs to boost it's purchasing power in the market...
VoIP in Java (Score:4)

by Drake42 ( 4074 ) writes: on Monday June 11, 2001 @02:52PM (#159474) Homepage

Hi All,

I wrote a trivial test app using the java sound API to make a VoIP program. It didn't implement any kind of standard, and it was completely insecure, but it worked after a relatively small amount of effort and it performed really well.

Java Sound passes just about everything through to the card so Java vs C didn't really come in to play much. All I did was decide that one machine was going to play server, and then everyone who connected to that machine got their byte streams mixed using the java Mixer and then sent back the mixed stream.

I'm up to my neck it projects right now, but if someone wanted to lead it up, I'd submit code and experience. Then we wouldn't have to worry about platform at all.

Jason

Share
twitter facebook
Re:You forget cost/profit analysis (Score:2)

by Zach Baker ( 5303 ) writes:

NB: I'm unrelated to the previous poster. I just thought I'd chime in with my own depressing take on things.
Excellent, I was hoping to get a reply from a developer.
1) Do you use Linux much? Have you used it for any game/non-game development? It is definitely capable of running games. Windows is only worthy of games because of the huge marketshare, not because it is a better gaming platform. It is also obvious that the Linux community wants games. Are you part of that community? Do you not agree?

I think you would be better off asking a major publisher these questions. Developers who would like to support Linux and can afford to often do so. Also, BeOS was an even better gaming platform than Linux or Windows, but look how far quality gets you in this race.
Also, porting to the Mac tends to be an afterthought. As in, let's worry about making the "real game" (on the primary platform) as freaknasty as possible, and think about possibly offloading the code on a Mac developer later.
And finally, regarding developers that wish they could be working in Linux, I presume you weren't referring to artists, level designers, testers or producers. ;^) Sorry to be brutally honest, but hey, remember the PS2 does run Linux kick-assedly (although it's only been available in Japan [playstation.com]... so far [fakeroot.net])!
Summary of current market conditions: (Score:2)

by Zach Baker ( 5303 ) writes:

JOE LUNIX

You should make your upcoming game for Linux. It is technically
superior in many obscure ways.

PC GAME DEVELOPER

Sorry, we could only pick one operating system, and it turned out
to be Windows. Better luck next time.
Re:Let's avoid being ANAL (Score:1)

by JanneM ( 7445 ) writes:

Well, that _is_ a file transfer protocol, not to be confused with the file transfer protocol known as FTP... Not stupid, not a problem.

/Janne
Re:Let's avoid being ANAL (Score:1)

by JanneM ( 7445 ) writes:

Hmmm... Maybe I wasn't clear enough.

If you tell me to use 'file tranfer protocol', I'll assume that you mean FTP (and is - for some reason - on a let's-expand-all-acronyms-trip). If you say to use 'a file transfer protocol', I'll probably start by trying ssh, as it is safer, then FTP if ssh doesn't work (and rsh never, ever).

In the same vein, if you are talking about VoIP, you are talking about that specific protocol. If you are talking about 'voice over IP', that could mean anything (well, as long as it's about voice over IP, that is). This is even more so for 'voice over IP' than for 'file tranfer protocol', as FTP is a far more well known, widely used term than is VoIP.

So, VoIP is a specific protocol, 'voice over IP' is a technology application with many potential specific solutions.

Then we could get into the whole area of commoditisation of terminology (think thermos), but that would lead this post far beyond what is reasonable...

/Janne
Re:Wow! (Score:1)

by JanneM ( 7445 ) writes:

Hey, don't try to agree to things here! Where would academia (or open source) be if everybody suddenly started agreeing with each other? :)

Seriously, this is of course a rather pointless, unimportant argument, and (as a previous poster impolicated) this will all be sorted out through the normal language mechanisms of common usage. However, pointless, unimportant arguments can be rather fun (as nobody gets seriously offended), so here's my 2 cents:

The point made above that 'Voice over IP' was capitalized is an entirely valid one (and I totally missed that). When the poster talks of 'common usage', however, I feel it breaks down somewhat as (unlike FTP) the term simply isn't very common.

/Janne
Re:[OT] FFT. (BTW) (Score:1)

by Christopher Thomas ( 11717 ) writes:

BTW, I just wanted to say that its been a pleasure discussing things with a person who actually knows what they're talking about and doesn't dodge questions :)

Likewise.

I should probably email you with my email address, as these articles will be archived soon, preventing further comments (especially the antimatter propulsion one). Is your listed email address accurate (modulo spam-removal)?
[OT] FFT. (Score:2)

by Christopher Thomas ( 11717 ) writes:

Do you know why you wouldn't use an FFT for audio or video compression?

Out of curiosity, why?

Dabbling in signal processing and audio/video compression is one of my hobbies, though I'm only an amateur.
Re:[OT] FFT. (Score:2)

by Christopher Thomas ( 11717 ) writes:

So, the FFT breaks down a signal into both a sine *and* cosine, summed together (they use a really neat math trick involving imaginary exponents to do this :) ). The only problem with this for compression is that you end up with twice the amount of data you started with!

I probably should have mentioned that I already know how the FFT works; I was just wondering why it would be unsuitable.

I'd get around it producing twice as much data by discarding half of it. The real and complex components of the spectrum of a purely real signal will be symmetric and antisymmetric, respectively, so I can discard half of each and then reconstruct them before I do the IFFT when unpacking the compressed data.

The DCT can be mathematically derived from the FFT by doing similar tricks, as you're probably already familiar with (you treat the input waveform as half of a symmetric signal, which has no imaginary component in its twice-as-long FFT).

Out of curiosity, what is the basis behind the MDCT? I've heard of it but haven't seen an explanation of how it works.

[OT: I replied to your antimatter post with some of the information you asked for. Short version: It's really, *really* expensive to produce, and solar radiation isn't energetic enough to help.]
Re:[OT] FFT. (Score:2)

by Christopher Thomas ( 11717 ) writes:

well for realtime video at least --- it it sloooooooow.

Actually, the FFT is quite fast. It works in near-linear time on the data (O(n log n)). It's the DFT that's the slow version.
Re:[OT] FFT. (Score:2)

by Christopher Thomas ( 11717 ) writes:

The real issue is that you can't just throw away half of the data. If all you care about is magnitudes, then yes, you can take the magnitude of the energies for a given frequency. However, if you want to reproduce the original signal, you can't.

This would be the case if I was throwing away all of the real or all of the imaginary data, but I'm keeping half of each.

The spectrum really is perfectly symmetric for the real component and antisymetric for the imaginary component. Deviations from this symmetry would cause an imaginary component to exist in the input signal, which can a priori be assumed not to be the case. Phase in the real signal is encoded in the ratio of real and imaginary spectrum components for a given frequency.

The handwaving argument from information theory is that because I'm only encoding N values for N samples (the real components of the samples), I only need N components out to retrieve all of the information (the real and imaginary components of half of the spectrum). The full spectrum has a factor of two redundancy.

Again, the DCT makes the same assumptions I do in throwing away this data (it just does some additional tweaking on top of it). I can show the derivation if you like, but you probably already have it on file.

Your signal processing library is more complete than mine :). My library just does 1d DFT and 1d FFT, though the n-dimensional versions are numerically equivalent (you just have to chop up and reassemble the 1d spectrum).

BTW - I'm looking for a way of doing the equivalent of a FFT on a non-power-of-2 number of samples. Zero-padding produces an interpolated spectrum containing more information than I need. Any other scheme I can think of takes as much work as a DFT of the samples would. Any thoughts on this?
Tribes2 uses GSM (Score:5)

by Caballero ( 11938 ) writes: <daryll@dar[ ].net ['yll' in gap]> on Monday June 11, 2001 @03:12PM (#159485) Homepage

Loki had the same sort of problems when they ported Tribes2. They switched over to a freely available GSM encoding (from a university in Germany). It worked so well they're adding the code to the Windows version so you can chat between versions.

Share
twitter facebook
Re:sure it's been done ... (Score:2)

by Mullen ( 14656 ) writes:

As a person who plays Tribes 2 alot, I am agaist putting the VoiceIP stuff into the game. Tribes2 VoiceIP is horiable and I know that other companies can do alot better job at the VoiceIP than the makers of Tribes2.

--
Re:*nix based Voice over IP is easy! (Score:1)

by DarkToast ( 18370 ) writes:

We actually once tried bringing live audience into our Shoutcast radio. We encoded the Shoutcast stream by piping esdmon | lame ... | shout ..., and thus could mix multiple music sources (from EsounD-supporting players) as well as speech.
For the remote speaker, we tried running esdrec and then esdmon | lame ... | nc broadcaster-host:12345. For the broadcaster, we joined the remote speaker into the stream by: nc -l 12345 | mpg123 -s - | esdcat.
The only problem was - it had a latency of about 5 seconds :)
Add voice to old games with viavoice (Score:2)

by tap ( 18562 ) writes:

Paradise 2000 [speakeasy.org], a Netrek [netrek.org] client for Linux, uses IBM's viavoice for linux to get speech output. Netrek is probably the oldest real-time graphical game on the internet. It has a sophisticated text messaging and macro system, but pre-dates normal computers even having sound hardware, much less the power for voice over IP.
With a text-to-speech system, you can get voice output without having to worry about bandwith issues, poor quality sound, or people without a microphone.
With Netrek's RCD macro system, it's pretty nifty the things you can do. For example, a player who is in a base is hurt, and pushes a single key for generic distress, causing everyone on their team to get a message like:
F0->FED Help(SB)! 0% shd, 50% dmg, 70% fuel, WTEMP! 9 armies!
But your client will speak, "Base hurt, weapon temped", because all those numbers are a pain to listen to. Later the base is ok, so he pushes the same key.
F0->FED Help(SB)! 99% shd, 0% dmg, 99% fuel, 1 army!
Now the client just speaks, "Base is ok". The macros can have "if" statements based on the relevant values, e.g. if(damage>50) "hurt" else "is ok". It's a lot faster to just push a key than to say the relevant information. And if you don't have all the noise, you don't lose text communication with your teammates.
BTW, if it wasn't obvious, this is a shameless plug.
Re:Let's avoid being ANAL (Score:1)

by No-op ( 19111 ) writes:

Actually, he's not really being anal; for those of us who actually work with VoIP, people who randomly throw these terms around due to lack of knowledge annoy the living hell out of us.

You'd probably get pissed off if I compared visual basic and C++, but to me they are more or less the same thing... so I avoid making those kinds of comparisons. do the same for us :P
VOCAL (Score:2)

by NitsujTPU ( 19263 ) writes:

http://www.vovida.org/ if you search around this website, you should come across a program called VOCAL, which is known to compile under linux, and provide some degree of VoIP support.
Wow! (Score:3)

by mindstrm ( 20013 ) writes: on Tuesday June 12, 2001 @06:17AM (#159491)

Guys, I wasn't trying to be anal, only educate people, because I know a lot of poeple are not aware there is an actual protocol called VoIP.

The comparison to ftp is entirely accurate.

I never said anyone was wrong, only that we should avoid confusion.

Share
twitter facebook
Let's avoid confusion.. (Score:5)

by mindstrm ( 20013 ) writes: on Monday June 11, 2001 @03:05PM (#159492)

VOIP, or Voice Over IP, has come to mean a specific suite of protocols for providing telco integration, call setup/teardown, etc. We shouldn't use it as a generic term for 'speech over internet' anymore..

Share
twitter facebook
Re:Will this matter once games begin including voi (Score:2)

by ywwg ( 20925 ) writes:

is this feature in the linux version as well?
Re:Speak Freely for Unix (Score:1)

by joekool ( 21359 ) writes:

Back in the day(94?), when DOOM was new, my best friend played it on his laptop, which had no sound, as was common at the time. That meant DOOM played over the crappy pc speaker on the laptop, which was just annoying. So he played with no sound. In a completly dark room, with no sound, it could be a very creepy game.
Re:Have you considered... (Score:2)

by WNight ( 23683 ) writes:

Speech has more redundant information that most audio signals... A large factor in speech recognition is context.

How lossy are you willing to have this?

Written english contains something like 1.6 bits of entropy per character, meaning aproximately 10 bits per word. That can be pumped into a speech synthesizer and result in *very* lossy representation yet it maintains almost 100% of the meaning. (Ignoring fine details like inflection, etc.)

The best way, from a bandwidth point of view would be to detect phonemes and transmit only those, recreating the audio on the other side. This is a 'bit' CPU heavy for realtime...

At any rate, MP3 encoders might not be the best thing to use, they're tweaked to work well on music not speech. Likely you'll create a larger file than you need. Simply mask out all frequencies below 500hz and above 3000hz (I think) and apply a simple logrithmic encoding and you'll get fairly decent compression. It's also fast.

Nice to see you still posting to Slashdot, I had thought you weren't here anymore, being that you vanished in the middle of a thread... Or, do you not check your posts (slashdot.org/users.pl) to see if you've had any responses?
Re:Have you considered... (Score:2)

by WNight ( 23683 ) writes:

Sorry, I forgot I was talking to the person who proved Fermat's last theorem when she was three... Or was that only solve the traveling salesman problem in constant time... Oh, no, just an expert at audio compression, the only opinion that matters in any matter philosophical, and the first person to solve the Monty Hall problem.

I really wish you'd read the posts before you reply to them, you'd be a lot more relevant.

I never said MP3 wasn't usable, I said *MP3 ENCODERS* weren't a good bet. The MP3 encoders ARE tweaked towards music, that's what people tend to use them for, and what they are engineered to do well.

You made a comment about just finding one, tweaking it a bit, and having a great solution. The only open-source ones I've seen have been designed with "MP3s", or 128kbit music... If you used one of these you'd have a lot of tweaking to do to get decent compression from it.

It also doesn't fit the problem, as I saw it. This thread is about communication for a game. Barry White tends to perform very few concerts via Quake3 server.

While I'm at it, I think I should mention that telephone calls are compressed, usually with hard arbitrary cutoffs, and MuLaw encoding. Also, the whole point of logrithmic encoding is to have more resolution for quiet sounds, instead of the loud sounds. What you describe is exactly backwards from the way it works.

I assume though, that you find the quality of voice over the toll networks to be unbearably bad though.

You know, I did say "I think" after the numbers 500hz and 3000hz, that sounds right, but I may have misremembered. I even marked it as such. However, arbitrary cutoffs sound only marginally worse than a smoother cutoff, and usually only on specific benchmarks.

If you'd prefer a touch more CPU, then by all means, use a smooth falloff. I assumed speed was all-important because of the gaming aspect... I wouldn't want a 20% speed hit from audio encoding.

When you're doing audio compression, delaying the signal for 50 or 100ms is safe, and gives you all the context needed. I also understand about perceptive encoding, not that it's a huge technical achievment as you make it out to be.

As with the gun control thing... I think you've got an agenda and you can't see that it's not mine. I don't give a rat's ass about gun control in your country. I do however find it annoying when someone throws around a bunch of misused statistics and emotional ploys to deceive people as to the severity of the problem. If you really had a strong point, you wouldn't need to use a bunch of tricks.

And you don't really seem bitchier, you were more insulting in the other thread and just as quick to anger, without reading the post.

Don't waste your precious time on my account, if I want to have this kind of technical discussion I'll go talk with the marketing team.
Re:Have you considered... (Score:2)

by WNight ( 23683 ) writes:

I think the key in your question is the word 'good'...

Many encoders do this (arbitrary cutoffs, etc) but they aren't designed from the same point of view as an average encoder on a PC...

The reason simple *Law encoding is so popular in telephony is that it takes very little CPU time, on embedded CPUs without an FPU. It also produces a constant-sized output. That's very handy in networking with technologies like ATM where you can reserve bandwidth.

If you have CPU time to burn, even 10% of a modern CPU, then doing a perceptual encoding of speech in realtime gets your better quality at comparable sizes. But if you're trying to do this with so little CPU as to be transparent to the primary application (usually, in this discussion, a game taking 100% of the CPU) or on a tiny 8-bit CPU with 64 bytes of RAM, FFTs (etc) aren't an option.

The reason for the hard cutoffs is that they're usually done in hardware while the signal is still analog. I believe you can get current soundcards to do something like this while recording. If you can't, the work required to seperate out the unwanted frequencies through software means you should probably do a more advanced encoding.

If you're curious though, try an 8000hz voice sample with cutoffs where Gordonjcp suggested, then if your audio program supports it, try 8bit logrithmic encoding. (Not just 8-bit linear encoding, it's much worse.) It's definately not worth encoding music in, but it's not bad for voice. (After all, the telcos use it.)

What do you do for work? I have a feeling we're both in opposite ends of the same field.
Re:Impressive (Score:2)

by gmhowell ( 26755 ) writes:

Ummm... Go reread the post. I think it was humor.
Re:sure it's been done ... (Score:2)

by gmhowell ( 26755 ) writes:

Excellent. And with that DirectX 8 that is due out for Linux, he'll be all set!

Seriously, while a PITA for Linux, it will be great for 'Doze games.
Signalling + transport (Score:1)

by habib23 ( 33217 ) writes:

What you want is a nice SIP API. SIP will be your session layer in this case, allowing for standards based signalling between all the endpoints. In addition, it can be used to provide integrated instant messaging and even setup the gamimg session itself. Much progress has been made in the past year, and real implementations are starting to appear. The Vovida stack would probably be the best place to start, but a slightly higher level interface might be helpful for this sort of application (a GPL'd stack wouldn't hurt either, they use the Vovida license). For those interested in SIP checkout Henning Schulzrinne's SIP page @ http://www.cs.columbia.edu/~hgs/sip/
Re:What about Macintosh? (Score:2)

by darmou ( 34207 ) writes:

Rodger Wilco has a mac version as well
UCL Conferencing tools (Score:1)

by Cabby ( 39912 ) writes:

We've used RAT [ucl.ac.uk] from UCL quite sucessfully for VoIP calls in Linux (and between Linux and other platforms). It uses RTP for things like video and audio synching too and has handy gateway transcoding services available to it to translate between codecs. There's also a companion video client, VIC and a whiteboard system too.
Some of the releases are a touch tempremental but you should be able to make one of them work!
MacOS X could help. (Score:1)

by dmaxwell ( 43234 ) writes:

OS X is being touted as a OpenGL friendly desktop OS. While Mac game selections aren't as bountiful as Windows, the situation is better than in Linux. When games start appearing that are OS X native, the resulting sourcecode should be much easier to port the rest of the Unices. Granted the UI layer is completely different but the hardcore game engine components won't be so bad. Hell, it might even give GNUstep a large amount of interest. The GNUStep libraries would probably ease much of the pain of porting the UI stuff as well even if they aren't employed to backend 'nix desktops.

Games that are intended to sell well on the Mac and Windows have to be somewhat portable. The OS X environment resembles it's fellow Unices more than it does Windows. Of course, Microsoft can release DirectX libraries for OS X and not Linux..........

I've seen a lot of posts to effect that MacOS X is bad for Linux and will kill it's mindshare. OS X is no more bad for Linux than Linux is bad for the BSDs. Sure more people may use it as a desktop but it could mean that more commercial software comes to Linux (and the BSDs). A rising tide floats everyones boat.
Compression too! (Score:3)

by dmaxwell ( 43234 ) writes: on Monday June 11, 2001 @06:31PM (#159504)

ssh has the -C flag that uses the same compression algorithm as gzip. That might make the above scheme barely practical. I know, I know, it was a joke..

I will point out that ssh compression works wonders on a vnc session if the server is running sshd. There is a nice howto on the vnc page for tunneling the connection over ssh. Secure and faster....bonus!

Share
twitter facebook
suggestions (Score:2)

by DanThe1Man ( 46872 ) writes:

Will Linux Telephony [linuxtelephony.org] do the trick?
If how about useing Speak Freely [speakfreely.org] or Open Phone [openphone.org]?
Re:Tribes2 uses GSM (Score:2)

by Dwonis ( 52652 ) writes:

GSM is patented, I think.
------
Re:Have you considered... (Score:2)

by Dwonis ( 52652 ) writes:

Please... if you're qualified to discuss audio compression, how about the basics? Do you know how to compute an FFT? Do you know why you wouldn't use an FFT for audio or video compression? What about a DCT? MDCT? What do you know about quanization schemes? The advantages/disadvantages to storing quantized data with huffman encoding vs. arithmatic encoding? Have you ever written a single signal processing function? (I've written a whole library). Do you know anything about the subject at all?
I'd like to step in and ask: where I can find that information?
------
Re:sure it's been done ... (Score:2)

by Dwonis ( 52652 ) writes:

Eww... I'd never use DirectX for Linux.
------
Re:Tribes2 uses GSM (Score:1)

by QuoteMstr ( 55051 ) writes:

BSD-like license.
Re:Speak Freely for Unix (Score:1)

by spinkham ( 56603 ) writes:

Depends on your soundcard drivers.. The SB live I have has drivers that let multiple streams into the card, so you could hear and have your quake audio...
Re:HawkNL (Score:1)

by spinkham ( 56603 ) writes:

Mod this one up ;-)
Not only is this a cool project, they have good comparisons of open source implementations of telephony codecs..
Re:Will this matter once games begin including voi (Score:1)

by dinky ( 58716 ) writes:

Yep. They've added a GSM VoIP codec in the latest patch which works on both platforms (Windows and Linux).
Re:Responsibility of Game Publishers (Score:2)

by Zigg ( 64962 ) writes:

Petition, schmetition.

There are Linux game publishers out there. If their unit sales numbers were comparable to numbers for Windows games, then you'd see more publishers writing to Linux.

All the petitions in the world can't make an unprofitable situation profitable.
Re:waste (Score:1)

by Nevrar ( 65761 ) writes:

Um. Like yes it would.
*nix based Voice over IP is easy! (Score:5)

by dougmc ( 70836 ) writes: <dougmc+slashdot@frenzied.us> on Monday June 11, 2001 @03:49PM (#159515) Homepage

We've been doing `Voice over IP' for years, far longer than it's been `cool' -

cat /dev/audio | rsh remote-host 'cat > /dev/audio'

Well, for the year 2001 you may want to use `ssh' instead of `rsh', and /dev/dsp instead of /dev/audio for Linux, but the idea is still the same ...
Almost as much fun as making the Sun next to you belch while the newbie is using it!

Share
twitter facebook
Re:Responsibility of Game Publishers (Score:1)

by Gummbah ( 72706 ) writes:

As you know, id Software did a Linux version of Q3A. After a few months there was an interview with their CEO Todd Hollenshead (I can't for the life of it remember where, though. Perhaps someone from id would like to clarify?), claiming that the sales of the Linux version were very low. Low enough to not make it worth their time (though he said they will keep supporting Linux in the future).
ad
Re:Back up those assertions, please (Score:1)

by technos ( 73414 ) writes:

Well excuse my caps! They were in deference to the replied-to post! (Oh and bidirectional, compound that implied condition as another error as well)

Not logged in, guess who.
Re:Back up those assertions, please (Score:1)

by technos ( 73414 ) writes:

Wait, WTH is going on with the cookies?
Re:Speak Freely for Unix (Score:2)

by technos ( 73414 ) writes:

I use it all the time. (GF in CA, I'm in MI) The Linux version can be a bitch on some soundcards (ES137x and S3 SonicVibes) compared to the Windows version.

What else to say? It'll do multicast conversations, and coexist nicely in LPC10 mode with Quake on a 56K so long as you don't mind not being to hear Quake save the CD audio.. The delay can be unbearable. If I yell into the mic, I can hear myself five seconds later some nights. Others, it's nearly instantaneous.
Re:Speak Freely for Unix (Score:2)

by technos ( 73414 ) writes:

If you're referring to the ohphone that is a part of the OpenH323 stuff, you're only right because of the bitrates involved. The largest amount of bandwidth you can eat with SF is about equivalent to what you consume using G.723, one of the better compression ratio H323 modes.

Also, using H323 as a carrier would mean you need at least a 1/4 cycle (You are only sending or receiving data 1/4 of the time) peak throughput equivalent to a capped cable modem; I've used Speak Freely on as little as a 9600 baud modem in 100% cycle operation.
Re:Let's make one... (Score:4)

by technos ( 73414 ) writes: on Monday June 11, 2001 @03:05PM (#159521) Homepage Journal

GSM, the standard used in cell-phones, works fine with less than 1/5th of that bandwidth (bidirectional in 2.4-2.8kbps, will do realtime compression in a 486, and there are freely available GSM compressors.

There are also public domain encoders for the military voice standards, LPC and LPC10. Those are usable in as little as .96kbps, but they sound horrible. in comparison.

Share
twitter facebook
For Half-Life engine fans... (Score:2)

by antdude ( 79039 ) writes:

If there was a Linux port of Half-Life, then this upcoming voice feature [gamespy.com] would kick butt. We wouldn't need these voice utilities.

Off-topic, this upcoming Spectator feature [gamespy.com] would be sweet as well. :)
Half-Life 1.1.0.7 will have it (Score:1)

by kimihia ( 84738 ) writes:

I was reading today that the next patch for Half-Life will include funky little voice communication.

The article is on Gamespy, but b'dern it, Links is not letting me grab the text with my mouse. Um ... I think this is the link [gamespy.com].
Re:Let's avoid being ANAL (Score:1)

by _UnderTow_ ( 86073 ) writes:

Seriously, if you're speaking and your voice is being carried to it's destination via IP, then it's Voice over IP. I don't think anyone is confused by that.
Re:Open H.323 (Score:2)

by brunes69 ( 86786 ) writes:

I have used this and can attest to it working. It not oly worked, but worked fine connecting to a netmeeting user as well. So it is also viable for connecting to window users
"...to any Blizzard product..." (Score:1)

by Gerad ( 86818 ) writes:

Um, hello? Have you ever PLAYED Diablo 2? That makes EverQuest look NICE.
Tribes 2 Seems to Incorporate One (Score:2)

by Greyfox ( 87712 ) writes:

I noticed that the Linux Tribes2 seems to have some GSM thing built right in. Haven't fiddled with it yet, as I think my Microphone is dead and the game seems to run much too slowly on my system anyway (Time to spring and get that geforce2 *sigh*...)
Speak Freely for Unix (Score:5)

by Bigtoad ( 88790 ) writes: on Monday June 11, 2001 @02:45PM (#159528) Homepage

How about Speak Freely for Unix [fourmilab.ch]?

I have played with it a bit, and it seemed to work, but I haven't actually used it for gaming yet.&nbsp It didn't seem as simple to configure and use as some of the windoze voice comm programs, though.

Share
twitter facebook
Freshmeat (Score:3)

by rosewood ( 99925 ) writes: <rosewood@@@chat...ru> on Monday June 11, 2001 @02:47PM (#159529) Homepage Journal

I know when I was looking for tools like Net2Phone and Dialpad.com for linux I found a few SDKs on Freshmeat for VoIP. However, none of them were geared toward gaming such as Rogerwilco and Battlecom. I would love to find a good cross platform "game-geared" VoIP system. Btw - only RWBS is for Linux - no linux clients that I know of

Share
twitter facebook
Re:VoIP in Java (Score:1)

by borzwazie ( 101172 ) writes:

perhaps a great candidate for a sourceforge project?
There was an IP phone in the perl journal recently (Score:2)

by marcmac ( 105570 ) writes:

TPJ had an IP phone in ~70 lines of perl - this should be adaptable to your purposes, I'd think...
Let's make one... (Score:1)

by SirPoopsalot ( 111075 ) writes:

I've been toying around with a streaming MP3 system that would allow for unlimited conference sessions to take place, even if the client is on a modem. great for team games, and you can use as low as 16kbps of your total bandwidth to do it.

The central server has to have some pretty processing power and bandwidth, but it'd work.

I can write the linux server and client, but I don't know anything about programming witht he soundcard in windows.
sure it's been done ... (Score:2)

by Frizzled ( 123910 ) writes:

but, to date it hasn't been done well.

Roger Wilco [rogerwilco.com] is notorious for using up resources and being near-unhearable and Battlecom [shadowfactor.com], while pretty solid, also has problems with static creeping into the transmission (interestingly, ShadowFactor is about to discontinue Battlecom).

One solution is simply to build Voice-Over-IP into the app ... in the case of games, Valve's next patch to the Half-Life engine will contain code to do this very thing [gamespy.com]. It should be interesting to see if a company has more luck trying to write this code directly into their product, rather than replying on other companys to create add-ons (like BC and RW).

_f
Re:Let's avoid being ANAL (Score:2)

by SuiteSisterMary ( 123932 ) writes:

That's like saying that any protocol that transfers files over the Internet should be called 'File Transfer Protocol' and is equally stupid.
Re:Let's avoid being ANAL (Score:2)

by SuiteSisterMary ( 123932 ) writes:

So if I tell you to send me a file using 'file transfer protocol' you're going to assume that I mean good old RFCd FTP, port 21, all that crap? Or are you going to assume I mean HTTP? ICQ file send? MSN file send? SMTP MIME? scp? rcp? Zmodem over telnet? The poster above IS saying that Voice Over IP, the protocol used in all sorts of hardware and software, is NO different from 'voice over IP', or any form of protocol involving TCP/IP and transmission of voice. By his defintion, VoIP could involve FTPing sound files. VoIP is a specific technology, with a specific implementation. Hell, Linksys's new Cable/DSL router has a VoIP option; plug a regular phone into the assigned jack, and tell people to point their VoIP software at the router's IP address, and if somebody calls, it rings the phone.
Have you considered... (Score:2)

by Rei ( 128717 ) writes:

Have you considered taking a gpl'ed mp3 encoder and ripping out some of its (slow) frequency masking and hearing sensitivity calculations? That would take almost no effort. You'll take a quality/size hit (unimportant for speech), but it should be fast enough for real-time encoding without a significant CPU signature; then all you have to deal with is streaming. Also, given that this is speach we're dealing with, you can expect most of the frequencies in question to *not* be masking each other out or be outside the human hearing range (what point would there be to humans being able to make sounds like that? ;) ).

- Karen
Re:[OT] FFT. (Score:2)

by Rei ( 128717 ) writes:

First, the basics behind the FFT. The original concept behind fourier transformations was, of course, to break a signal down into component waveforms. Well, unfortunately, when you look at a cosine, it's symmetrical. A sine is anti-symmetrical. Thus, using only cosines, you could only represent symmetrical data, and with sines, anti-symmetrical data. So, the FFT breaks down a signal into both a sine *and* cosine, summed together (they use a really neat math trick involving imaginary exponents to do this :) ). The only problem with this for compression is that you end up with twice the amount of data you started with!

The DCT and MDCT are transforms which use a waveform which is not symmetrical or antisymmetrical, and thus can represent an arbitrary signal. Thus, with a DCT or MDCT, you get the same amount of data back as you put in (albeit, the data you get back is in floating point format, so you still will lose a bit just simply by storing it back into a integer representation via quantization, which is slightly lossy), but it works :)

- Rei
Re:[OT] FFT. (Score:2)

by Rei ( 128717 ) writes:

Hehe, sorry :) I figured since you asked about why the FFT isn't good for signal compression, you didn't know how to take them :)

The real issue is that you can't just throw away half of the data. If all you care about is magnitudes, then yes, you can take the magnitude of the energies for a given frequency. However, if you want to reproduce the original signal, you can't. The existance of the separate components - real and imaginary - store locational data. If you throw away where the peaks of the waves are located, in audio, cancellations with other waves don't work right, and some other problems. In images, its even worse, as data doesn't appear at all like it should (locations are wrong, intensities, etc).

I've never implemented the MDCT before (my library, when I last worked on it, covered DFTs, FFTs, DCTs, and wavelets (Haar and Daubiches), all of those in 1, 2, and 3d). I've only read a summary of it, but from what I read, it sounded like it works as follows: you start summing products from before the start of your block, and continue summing products past the end of your block (by 50%), normalizing appropriately. But, you only compute enough frequencies, using this, to be equal to the number of elements in your block. Now, this has the net effect of giving data that's not as accurate for your particular block; however, when you reproduce the signal, since every location has an overlap, when you average the overlapped sections together, the errors cancel out (an error in one direction on the MDCT corresponds to an error in the opposite direction in the adjacent block). It helps remove blocking artifacts while still keeping optimally-sized blocks for compression.

- Rei

P.S. - I responded to your post :) Short version: Not as much antimatter is needed as you think, and I dug up some solar wind energy levels, and found that, while the average location's energy level is quite low, energy levels at certain locations are as high as 102 GeV... while that's no 900 GeV like at FermiLab's best or 20 TeV like the supercolider would have produced, when FermiLab is producing antimatter they generally do so at 120 GeV - not too far off :) )
Re:Have you considered... (Score:3)

by Rei ( 128717 ) writes: on Tuesday June 12, 2001 @07:42AM (#159548) Homepage

I'll step in and answer some of those questions for you :)

First, before we can go into the FFT, we need to discuss the DCT (Discrete Fourier Transform). Of course, the purpose of the DFT is to break down a signal into component waveforms - but how? Well, picture that you have a waveform that is just a sine, lets say, 5 hz across the area you're looking at. Now, picture multiplying that waveform, at every location, times a sine waveform of zero hz, summing those together, and normalizing. What do you get? You get 0. :) Well, try it with a 1hz signal. You still get 0. Keep trying till you get to multiplying it times a 5hz signal. All of the sudden, you'll notice that wherever the signal is positive, it gets multipled by a positive value, and wherever it is negative, it gets multiplied by a negative value (making it positive) - so you no longer get 0 as your result. For everything over 5 hz, you'll get 0 again. You just broke down a simple waveform into a sinous component. :)

Now, as was discussed in another reply on this thread, to represent *any* signal, you can't just use a sine at even mhz boundaries and steps. For that, you have to use the sum of a sine and cosine. Due to a trick you can use involving imaginary exponents, you can get one part of the data to come out "real" and the other part "imaginary", and correspond to the individual components of your signal. But, since you get twice as much data back (real and imaginary components), it is a poor choice for compression. DCTs and MDCTs are briefly discussed in the other reply as well. The main difference between a DCT and MDCT is for use in block transforms. Also, the difference between a DFT and a FFT is that, in a DFT, you'll find that you're doing a lot of the same calculations multiple times, due to various properties of sines and cosines. FFTs are basicly a re-ordering of the calculations so that redundant ones are done (less often) (optimally, once). There are several different FFT algorithms.

Block transforms are to get rid of a nasty side effect of transformed data. If a signal exists in one part of the data, but not another, frequency decomposition has trouble dealing with this. It generally causes "ripples" of energy to appear (the goal of doing transforms, compression-wise, is to concentrate the energy in certain frequencies, and then store them - so, ripples are bad). If you look at a very large sample, many frequencies will start and stop. So, you break it down into blocks - if there's a start or stop of a frequency, it only causes ripples in that section.

This works fine until you start throwing away data on a DCT. Because different data will get thrown away in different blocks, while they'll have the same overall level of quality, there will still be discontinuities between the blocks. The MDCT effectively halves the block size and vastly reduces discontinuities by including an overlapped area in its calculations.

Before I can discuss quantization, you first have to understand thresholding and the principles of compressing a transformed signal, which were briefly discussed in my original post. After you transform the signal, ideally, your energy is concentrated in specific frequencies. The effect is something like a starburst in the upper left-hand corner of each block that was transformed. Generally, you will still have *some* energy in weak frequencies, but not much. So, you kill them off - generally with a threshold that varies over the human hearing range, in the case of audio. Also in audio, you generally want to take masking effects into account when killing off weak signals. Once your energy is left in strong signals, you need to store how strong. However, while your input signal might have been composed of 8 or 16 bit integers, your output data will generally be high decimal-resolution floating point values. You need to get it back into integers. This is known as quantization. Some schemes simply convert the data back linearly. Some create a table of arbitrary endpoints for what-converts-to-what. Some use a smooth function. There is a lot of debate over what is the best method. I personally recommend, in this case, after seing the tiny gains made by various other quantization methods, for a huge cpu/complexity cost, using linear quantization.

Huffman encoding is typically used to losslessly encode the quantized data. Huffman encoding has proved attractive because they already know what sort of tree they can build to compress the data well. However, I feel that using arithmatic encoding can give *huge* advantages, via frequency preduction. Because, not only do you know the signal density for a given location for an arbitrary signal, you know what it has been like for this *particular* signal in the past, and can scale your probabilities appropriately. (oh, btw, if you want info on how to use huffman or arithmatic encoding, just ask).

Anyways, I better get back to work. Ciao!

- Rei

Share
twitter facebook
Re:Have you considered... (Score:3)

by Rei ( 128717 ) writes: on Tuesday June 12, 2001 @08:13AM (#159549) Homepage

I just did a quick check on human voice ranges. Vowel sounds contain notable power at frequencies as low as 50 Hz. The sibilants and fricatives, s and f, contain significant amounts of power at frequencies as high as 8,000 Hz. Frequencies above 1,000 Hz contribute the most intelligibility to speech, even though only 16% of the energy lies above that frequency (Chapanis, 1996). Using such an arbitrary method of thresholding as truncation is *far* from optimal in signal compression. Perhaps it has some advantages in analog signal transmission - I don't deal much with that, and that seems to be what you deal with, so perhaps their are applications for it there - but when it comes to encoding digital signals for compression with your goal being the clarity vs. size ratio, that is a very poor method. A quick look at human hearing research shows that humans have vastly different sensitivies over different ranges, and just "throwing away" data outside of certain ranges regardless of signal intensity, and keeping data within those ranges regardless of how weak it is, will ruin your clarity vs. size ratio. That's why you won't see a single, good encoder, whos target is speech or music, that does this. Please, if you can present an example to the contrary, please do so :).

- Rei

P.S. - for those who care, the cpu cost of having a varying threshold over various ranges, instead of a constant one, is negligable compared to the time it takes to do the MDCT, quantize, encode, etc.

P.P.S - Any specific URLs from those organizations I should check out? I'm always looking for a good distraction from work ;)

Share
twitter facebook
Re:[OT] FFT. (Score:3)

by Rei ( 128717 ) writes: on Wednesday June 13, 2001 @08:55AM (#159550) Homepage

Actually, that isn't quite true. "All signals that come out of my microphone are real in the time domain so the frequency domain spectrum is symmetric". If you'll look at a raw audio file, you'll notice that it indeed not symmetric. I'm not sure how familiar you are with signal transforms, so I'll back up a bit. "real" and "imaginary" components are really just mathematical tricks, based on the property that
e^(X*i) = cos(X) + i*sin(X) (or is it the other way around?). The 'i' merely acts as a placeholder, it doesn't actually mean that the frequencies themselves are imaginary. By using exponential math, we can simply add to multiply.

A sine which is contained completely in a rtain frequency range, like a cosine, cannot store phase information - it requires both of them. Now, of course, you can extend the waveform in question so that it isn't completely contained in a certain frequency range - but that is no longer an FFT, but a DCT.

FFTs are useful because they evenly separate signals, and are quite fast. By computing the magnitude of a certain frequency's complex component, you can do windowing quite nicely to tell where your signals are. But, this magnitude alone is not enough to accurately reproduce the original signal with phase information. And, without phase information, cancellation effects can be very bad in the worst case, in fact, to the point of completely messing up your block.

Your example was really a DCT, but using sines instead of cosines :)

- Rei

Share
twitter facebook
Re:Have you considered... (Score:5)

by Rei ( 128717 ) writes: on Monday June 11, 2001 @07:03PM (#159551) Homepage

Sorry, some of us have lives on the weekends ;) My humblest apologies for having better things to do on a saturday night than to debate gun control with someone who loves to continue to dodge the simple question, "Do you think the US legal system is wrong 50 times as often as it is right?", by making pseudo-statistical arguments without overcoming the sheer numbers, and nitpick choices of examples in arguments without hitting the core of them (I.e., "just because something else is worse doesn't justify something bad"). But, regardless, this thread is about audio compression :)

When you do a block DCT or MDCT on an audio signal, you're not looking at a whole page of text's worth. You generally look at a fraction of a second. Speech has little redundancy at this level. However, that isn't what I was referring to. Do you have any background in audio compression? There are two keys in compressing audio using current methods: Frequency masking and signal response. Frequency masking is the fact that when the human ear hears a strong signal, weak signals that are near it in frequency seem to "dissapear" or "merge" with the stronger signal. Signal response (hearing response, frequency sensitivity, etc), is how good, overall, the human mind/ears are at hearing weak signals at various frequencies across the spectrum. By a careful knowledge of these, in music or voice, you can kill off many more frequencies than without it. However, it also is a big CPU consumer to do it very carefully. Cutting out some of the analysis can save you a good bit of CPU - and, in the case of human voice, which tends to be in a very audible range with few masking effects, won't affect your compression rate much.

Second, please, if you can create a good sounding speech synth - especially one that can give inflections, emotion, etc - please, please share it with us. Until then, good luck having something like this work (simply neglecting CPU issues) without sounding like a 50s robot that messes up once a second.

Oh, and to answer your theory about masking out frequencies below 500hz and above 3000hz: No. That will sound so unbelievably awful. First off, lets neglect the fact that someone with a voice like Barry White would be inaudible, and that you'd never hear a 't' or 's'. Ignoring that, that's a silly way to do it. You need a simple curve, even just a simple line graph. It takes little CPU time, and will actually be able to reproduce the original sound well. Arbitrary truncation points are unbearably bad.

Next, you seem to be of the notion that MP3 encoders are "tweaked towards music". MP3 encoding is a fairly abitrary term. MP3 is a specific format for encoding streamable, quantized, transformed data. You can use any truncation scheme you want - even the silly one you proposed. Most encoders you'll find are tweaked towards the human hearing range - an optimal choice for both voice and music (especially voice, though! Voice compresses very well, because, compared to music, it has most of its energy concentrated in a few signals at any given time).

Next, why use "logarithmic encoding" for compression? Logarithmic encoding is a (poor) way to store raw (uncompressed) audio data - it sacrifices low-level clarity for the ability to represent very loud signals - something seldom of use in normal audio compression applications (have you ever noticed how quiet signals on an 8-bit sound card are very crackly, but the loud ones are clear? Thats the sort of effect logarithmic encoding gives to sound). It is useful in efficient Pulse Code Modulation (PCM) of data for maximizing the number of transmissions over a small number of physical channels, but doesn't even begin to apply as far as storing quantized data is concerned (that would be like using a bubble sort to compute Pi or something ;) ). Do you mean "arithmatic encoding", perhaps? I have a neat theory for using that, with scaled probabilities, to create an optimal compression ratio (predictive) for thresholded, quantized data, that I came up with after the last slashdot conversation on compression (it was video compression then).

Please... if you're qualified to discuss audio compression, how about the basics? Do you know how to compute an FFT? Do you know why you wouldn't use an FFT for audio or video compression? What about a DCT? MDCT? What do you know about quanization schemes? The advantages/disadvantages to storing quantized data with huffman encoding vs. arithmatic encoding? Have you ever written a single signal processing function? (I've written a whole library). Do you know anything about the subject at all?

If you don't know what you're talking about, please don't be suggesting encoding schemes. There are enough bad ones out there already ;)

- Rei

P.S. - sorry if I seem a bit bitchy. For some reason, they decided to leave us without air conditioning today at work ;)

Share
twitter facebook
Vocal and SIP (Score:2)

by cullenfluffyjennings ( 138377 ) writes:

You could just hack your own protocol to do it Why not do it the open standards based way? Use SIP - it is good for voice and there is substantial work to build instant messaging and presence stuff on it. Someone, sipforum I think, showed Quake with SIP. I have worked on the SIP stuff at www.vovida.org and it is open source.
Will this matter once games begin including voice? (Score:2)

by fetta ( 141344 ) writes:

Will this matter once games begin including voice as a feature within the game? Tribes 2, for example, includes its own built in voice communication system.

Granted, external programs like GameVoice and Roger Wilco offer some additional feature, but will the average gamer care? I would expect every game that has a team play element to include its own voice technology within a year or two.
Re:Let's avoid being ANAL (Score:2)

by Erasmus Darwin ( 183180 ) writes:

So, VoIP is a specific protocol, 'voice over IP' is a technology application with many potential specific solutions.
I'd debate this point. First, the article text specificially said "Voice over IP" (with voice being capitalized for no other reason). That, to me, implies a proper name rather than a general technology label, not unlike if someone where to say "File Transfer Protocol".
Second, the phrase "Voice over IP" is frequently used when referring to (what you call) "VoIP". For some reason, you seem to use "file transfer protocol" (FTP) versus "a file transfer protocol" (anything to move a file), but create an artificially more relaxed standard where the expansion of "VoIP" is the generic term. This simply isn't the case.
Also, while I hesitate to use pointing out popular usage as a means of winning this argument (and I'm sure it'll come back and bite me when the next cracker vs. hacker argument pops up), a quick google search on "voice over IP" appears to turn up links that're all about VoIP, rather than "use the Internet to talk to your friends". Again, while I'm loathe to normally invoke the popular definition, it's worth pointing out that it seems to coincide with the technical one, providing a much more compelling case.
Overall, I suspect your problem is that you've fixated on the fact that "voice over IP" is an overly broad term that, when analyzed, could apply to a broader set of items than what it's actually intended for. On the other hand, we've already got the afore-mentioned example of "file transfer protocol", we know that a "personal video recorder" is a TiVo-like device rather than just any video recorder owned by an individual, we know that a "television" (literally: far seeing) isn't just any device that lets you see far away -- telescopes and binoculars certainly aren't part of that group. When it comes down to it, a name is really just an arbitrary designator that just happens to usually have some relevance. If you want names that're complete descriptions, you might want to switch from English to German.
Searching freshmeat still seems to work ... (Score:2)

by Tyndareos ( 206375 ) writes:

http://lumumba.luc.ac.be/jori/jvoiplib/jvoiplib.ht ml [luc.ac.be].

Don't know if it's really what the person is looking for, but it's worth a shot.
Re:Responsibility of Game Publishers (Score:2)

by Moridineas ( 213502 ) writes:

I cited Carmack as a developer who is into linux. Look at other prominent developers, such as Sweeney--they don't care nearly as much. Nor do most game developers i think. MOST people don't need or want to have to have an opinion in the OS wars. It's enough to make money and do what works.

Whether the constraints of multiplatformism are benefitial? Most likely (looking at it structurally, and assuming that at least ONE profitable and worthwhile port will be made). However the platform that makes up for this cost deficit, ISN'T *nix.

The "bottlenecks" are managers--I guess you are correctin saying this, though i wouldn't use the term bottleneck. Programmers create a product, managers make money.

Scott
Re:Team up w/ Apple (OS X) - MORE BUYERS! (Score:2)

by infiniti99 ( 219973 ) writes:

I totally agree. This even makes sense when combined with my other post in this thread. MacOS has a fairly low market share, just like Linux, but it has a better reputation as a desktop OS and so it gets lots of good commercial applications and games. No doubt these developers (like Adobe, Blizzard, etc) are interested in easy crossplatform development.

With SDL and OpenGL working on Mac, and Qt on its way there, companies may be very interested in utilizing these libraries/technologies for crossplatform Windows/Mac development. The kicker though is that these are also supported fully on Linux. If these libraries catch on in the Mac world, I'm fairly sure Linux will see lots of the same apps merely by side-effect.

-Justin
Re:You forget cost/profit analysis (Score:2)

by infiniti99 ( 219973 ) writes:

Excellent, I was hoping to get a reply from a developer.

1) Do you use Linux much? Have you used it for any game/non-game development? It is definitely capable of running games. Windows is only worthy of games because of the huge marketshare, not because it is a better gaming platform. It is also obvious that the Linux community wants games. Are you part of that community? Do you not agree?

2) Still, using OpenGL would make a Mac port easier. Choosing OpenGL over DirectX is mainly a portability decision. Do you not want your game to run on Mac either?

3) True. Any game worth its salt has its own GUI library. I meant to bring up Qt as a reference to application developers, broadening my argument for crossplatform programming to include all types of software. I must say Qt may be a good in-house tool at a game company though, for developing map editors, game editors, etc if you have a heterogenous development environment. It may encourage a crossplatform mindset in your company as well. Do any of your developers wish they could be working in Linux?

I understand what you mean though. It all comes down to whether or not you consider portability important, and whether or not it would turn a profit.
Re:You forget cost/profit analysis (Score:3)

by infiniti99 ( 219973 ) writes: <justin@affinix.com> on Monday June 11, 2001 @04:19PM (#159573) Homepage

That's why there is SDL [libsdl.org]. It uses DirectX on Windows and DRI on X (as well as many other graphics layers / OS's).

I think the problem with Windows developers in general is that they don't think of coding crossplatform in the first place. It's easy to understand why: they are taught DirectX and MFC, and Windows has a huge percent of the desktop market. Also, some games are coded so horribly (compare the duct-tape-that-is-EverQuest to any Blizzard product) that porting certain games look like they would be a nightmare.

On the other hand, I think Linux developers are more trained to code portably. With all the unix flavors out there, source portability is already a must. It also seems that these developers care about porting to Windows. Many apps for X are available on Windows (like a lot of the Gtk stuff), but not the other way around.

So Linux developers actually care about portability, but Windows developers do not. Maybe we can convince them to change their ways?

Surely the Windows developers out there don't thoroughly enjoy Windows-only programming, do they? I've used DirectX, and it was ludicrous. It isn't direct at all (Come on, DirectMusic? DirectPlay? Direct is just a buzzword..) and the classes are a mess. I haven't heard much good about MFC either, but I've heard only good about Qt [trolltech.com] (and I've used both).

Qt works on Windows. There's no reason to use MFC. Yes it does cost money, but aren't we talking about real game companies here? SDL works on Windows. There's no reason to use DirectX "directly" (whatever that means). You know how long it would take to port Windows apps/games to Linux that were all written in Qt and SDL? All of a recompile.

Share
twitter facebook
List of VoIP applications (Score:2)

by leonia ( 246522 ) writes:

There is a pretty comprehensive list of VoIP and multimedia applications at http://www.cs.columbia.edu/~hgs/rtp [columbia.edu] and http://www.cs.columbia.edu/sip [columbia.edu] (under 'implementations') These applications are all interoperable, at either the media or call set up level. Among others, our sipc application runs on Linux (and Solaris, FreeBSD, etc.).
Re:sure it's been done ... (Score:2)

by geomcbay ( 263540 ) writes:

Well Microsoft's added voice communications into DirectX, so each game developer for Windows won't have to recreate this stuff, they'll just code some generic interfaces into the API.
For the same sort of thing, but cross-platform, there's HawkNL [hawksoft.com]
HawkNL (Score:3)

by geomcbay ( 263540 ) writes: on Monday June 11, 2001 @03:01PM (#159582)

HawkNL [hawksoft.com] is a nice LPGL library (currently with Win and Linux support) for doing, among other things, voice over IP.
Its targeted at game programmers, to be integrated in-game, as a cross-platform alternative to Microsoft's DirectPlay and DirectPlay voice, but could be used to do a stand-alone VOIP app as well (though I am not aware of any currently).

Share
twitter facebook
Just a little effort... (Score:2)

by Bonkers54 ( 416354 ) writes:

With a mere 15 seconds of research I found a wonderful library here [luc.ac.be] that would very easily be utilized to make a VoIP app for games. All the work is done for you just about, now go finish the rest. This library will even work on both windows and linux, so you could make your program cross platform! Sometimes all it takes is hitting the search button to find what you need.
-Matt
easy problem--easy solution (Score:2)

by Not A Democrat ( 448542 ) writes:

Well, voip isn't very complicated. If you're unsatisfied with the currently available solutions like those available at LinuxTelephony [linuxtelephony.org] or OpenPhone [openphone.org] you can always roll your own. I know I'd lay down a couple bucks for a satisfactory one, so you can probably sell it and make a couple quick bucks from your effort, too. Stop whining and make a difference!
Vovida (Score:2)

by blang ( 450736 ) writes:

Is an opensource SIP stack. It's an overkill if you just want to chat during games, but it's there.
Responsibility of Game Publishers (Score:2)

by idonotexist ( 450877 ) writes:

To the point: game publishers need to get off their arse and provide a linux version for each release.

This is the only way I can see we can solve the problem of not having the "latest and greatest" windows games on Linux. Also, by the time one completes porting a game from Windows to Linux, it is likely the game is passe (except maybe for halflife (the game will not die)).

Is there a petition I can sign? A list of game publisher email addresses to send an email to? I think part of the problem is that game publishers do not see a demand on Linux platforms. Perhaps if publishers were communicated a significant interest, then they should at least think about providing for Linux versions.
You forget cost/profit analysis (Score:4)

by ColGraff ( 454761 ) writes: <maron1&mindspring,com> on Monday June 11, 2001 @03:22PM (#159602) Homepage Journal

Most modern computer games uses Direct3D and DirectX a great deal. These libraries are not portable, and they are what most developers have experience with. You are asking, in essence, for developers to either have two devteams working in parellel, or one team programming both versions. In either case, the game company can either completely rewrite the game engine for each OS, or it can create one highly portable engine. The problem with the latter option is that DirectX in particular really is the backbone of Windows gaming. It would be very hard to convince developers to give it up, and I'm not sure "DirectX free" windows games could match the perfomance of their windows-only counterparts.

Alos, have you considered the expense of training all these developers in Linux? Remember, most of them do not have Linux experience.

Finally, when you consider that Windows controls 90something percent of the desktop gamer market, it just doesn't make sense for a company to pour massive resources into developing Linux and Windows games simultaneously that only a relatively small number of people would buy. At least a dedicated porting company like Loki doesn't have to worry about graphic artists, level designers, story writers, or game design as a whole.

Share
twitter facebook

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Speak Freely (Score:1)

Re:VoIP in Java (Score:1)

Re:You forget cost/profit analysis (Score:1)

Re:Responsibility of Game Publishers (Score:1)

Re:VoIP in Java (Score:1)

Try these out (Score:3)

Re:Speak Freely for Unix (Score:1)

H.323 (Score:2)

Re:Responsibility of Game Publishers (Score:1)

Re:Let's avoid being ANAL (Score:1)

Re:Have you considered... (Score:1)

Game publishers need $ figures (Score:1)

Re:Compression too! (Score:1)

Re:VoIP in Java (Score:2)

Team up w/ Apple (OS X) - MORE BUYERS! (Score:1)

VoIP in Java (Score:4)

Re:You forget cost/profit analysis (Score:2)

Summary of current market conditions: (Score:2)

Re:Let's avoid being ANAL (Score:1)

Re:Let's avoid being ANAL (Score:1)

Re:Wow! (Score:1)

Re:[OT] FFT. (BTW) (Score:1)

[OT] FFT. (Score:2)

Re:[OT] FFT. (Score:2)

Re:[OT] FFT. (Score:2)

Re:[OT] FFT. (Score:2)

Tribes2 uses GSM (Score:5)

Re:sure it's been done ... (Score:2)

Re:*nix based Voice over IP is easy! (Score:1)

Add voice to old games with viavoice (Score:2)

Re:Let's avoid being ANAL (Score:1)

VOCAL (Score:2)

Wow! (Score:3)

Let's avoid confusion.. (Score:5)

Re:Will this matter once games begin including voi (Score:2)

Re:Speak Freely for Unix (Score:1)

Re:Have you considered... (Score:2)

Re:Have you considered... (Score:2)

Re:Have you considered... (Score:2)

Re:Impressive (Score:2)

Re:sure it's been done ... (Score:2)

Signalling + transport (Score:1)

Re:What about Macintosh? (Score:2)

UCL Conferencing tools (Score:1)

MacOS X could help. (Score:1)

Compression too! (Score:3)

suggestions (Score:2)

Re:Tribes2 uses GSM (Score:2)

Re:Have you considered... (Score:2)

Re:sure it's been done ... (Score:2)

Re:Tribes2 uses GSM (Score:1)

Re:Speak Freely for Unix (Score:1)

Re:HawkNL (Score:1)

Re:Will this matter once games begin including voi (Score:1)

Re:Responsibility of Game Publishers (Score:2)

Re:waste (Score:1)

*nix based Voice over IP is easy! (Score:5)

Re:Responsibility of Game Publishers (Score:1)

Re:Back up those assertions, please (Score:1)

Re:Back up those assertions, please (Score:1)

Re:Speak Freely for Unix (Score:2)

Re:Speak Freely for Unix (Score:2)

Re:Let's make one... (Score:4)

For Half-Life engine fans... (Score:2)

Half-Life 1.1.0.7 will have it (Score:1)

Re:Let's avoid being ANAL (Score:1)

Re:Open H.323 (Score:2)

"...to any Blizzard product..." (Score:1)

Tribes 2 Seems to Incorporate One (Score:2)

Speak Freely for Unix (Score:5)

Freshmeat (Score:3)

Re:VoIP in Java (Score:1)

There was an IP phone in the perl journal recently (Score:2)

Let's make one... (Score:1)

sure it's been done ... (Score:2)

Re:Let's avoid being ANAL (Score:2)

Re:Let's avoid being ANAL (Score:2)

Have you considered... (Score:2)

Re:[OT] FFT. (Score:2)

Re:[OT] FFT. (Score:2)